We study pathwise approximation of scalar stochastic differential
equations at a single point. We provide the exact rate of convergence
of the minimal errors that can be achieved by arbitrary numerical
methods that are based (in a measurable way) on a finite number
of sequential observations of the driving Brownian motion. The
resulting lower error bounds hold in particular for all methods that
are implementable on a computer and use a random number generator
to simulate the driving Brownian motion at finitely many points.

Our analysis shows that approximation at a single point is strongly
connected to an integration problem for the driving Brownian motion
with a random weight. Exploiting general ideas from estimation
of weighted integrals of stochastic processes, we introduce an adaptive
scheme, which is easy to implement and performs asymptotically
optimally.

1. Introduction. We consider a scalar stochastic differential equation

\[ dX(t) = a(t,X(t))\,dt + \sigma(t,X(t))\,dW(t), \qquad t \in [0,1], \tag{1} \]

with initial value $X(0)$. Here $W$ denotes a one-dimensional Brownian motion,
and $a\colon [0,1]\times\mathbb{R}\to\mathbb{R}$ and $\sigma\colon [0,1]\times\mathbb{R}\to\mathbb{R}$ satisfy standard smoothness
conditions.

In most cases an explicit solution of (1) will not be available so that an
approximation $\widehat{X}$ must be used. Assume that the driving Brownian motion $W$
may be evaluated at a finite number of points. Then the following questions
are of interest:

1. Where in the unit interval should these evaluations be made and how 
should the resulting data be used in order to obtain the best possible 
approximation to the solution? 
2. What is the minimal error that can be achieved if at most N evaluations
of W are made on the average?

The analysis of these problems clearly needs the specification of an error 
criterion. The two main approaches in the literature are: 

(i) approximation at a finite number of points, that is, $\widehat{X}$ is compared to
the solution $X$ at finitely many points in the unit interval,

(ii) global approximation, that is, $\widehat{X}$ is compared to the solution $X$ globally
on the unit interval.

First results for global approximation are due to Pardoux and Talay (1985)
who studied almost surely uniform convergence of specific approximations.
Faure (1992) determines an upper bound with an unspecified constant for
the average $L_\infty$-error of an Euler scheme with piecewise linear interpolation.
Complete answers (in an asymptotic sense) to questions 1 and 2 above
are given in Hofmann, Müller-Gronbach and Ritter (2001) for the average
$L_2$-error and Müller-Gronbach (2002b) for the average $L_\infty$-error. In these
papers the exact rate of convergence of the minimal error is determined
and adaptive methods are presented that are easy to implement and perform
asymptotically optimally.

Much less is known for the problem of approximation at a finite number of
points. Here, the majority of results deal only with upper bounds for the error
of specific schemes at the discretization points; see, for example, Kloeden
and Platen (1995) for an overview. Lower bounds for approximation at t = 1
were first presented in Clark and Cameron (1980) who considered an autonomous
equation (1) with constant diffusion $\sigma = 1$ and determined the rate
of convergence of the minimal mean squared error that can be obtained by
equidistant evaluation of the driving Brownian motion W. Rümelin (1982)
studied autonomous equations with a nonconstant diffusion coefficient and
presented the order of the minimal error that can be obtained by Runge-Kutta
methods based on equidistant evaluation of W. The most far-reaching
result is due to Cambanis and Hu (1996) who analyzed the mean squared
error of the conditional expectation of X(1) given observations of W at
points that are regularly generated by some density. They provided the rate
of convergence of the corresponding mean squared error and determined the
optimal density. Clearly, all these results only provide partial answers to
questions 1 and 2 above. For instance, the implementation of a conditional
expectation will be a hard task in general. Furthermore, considerations are
restricted to numerical methods that are based on sampling W at prefixed
points in the unit interval (either equidistant or regularly generated by some
density). Adaptive methods which take into account the particular trajectory
of the solution are not covered. See Remarks 1 and 3 for a discussion.

In the present paper we provide a detailed analysis of approximation at
t = 1 with respect to questions 1 and 2. Our results cover all numerical
methods that are based on the initial value $X(0)$ and finitely many sequential
observations

\[ W(\tau_1), \dots, W(\tau_\nu) \]

of the driving Brownian motion W. Except for measurability conditions, we
do not impose any further restrictions. The kth evaluation point $\tau_k$ may
depend on the previous evaluations $X(0), W(\tau_1), \dots, W(\tau_{k-1})$ and the total
number $\nu$ of observations of W may be determined by a stopping rule.
Finally, the resulting discrete data may be used in any way to generate an
estimator

\[ \widehat{X}(1) = \phi(W(\tau_1), \dots, W(\tau_\nu)) \]

of $X(1)$, the solution at t = 1. For example, the adaptive Euler-Maruyama
scheme recently introduced in Lamba, Mattingly and Stuart (2003) is of this
type.

The error of $\widehat{X}(1)$ is defined by

\[ e_p(\widehat{X}(1)) = \bigl(E|X(1) - \widehat{X}(1)|^p\bigr)^{1/p}, \]

where $p \in [1,\infty[$, and $c(\widehat{X}(1)) = E(\nu)$ is the average number of evaluations
of W used by $\widehat{X}(1)$.

Our analysis shows that the problem of pathwise approximation at t = 1
is strongly connected to an integration problem for the driving Brownian
motion W with the random weight

\[ \mathcal{Y}(t) = \mathcal{M}(t,1) \cdot \mathcal{G}(t,X(t)), \qquad t \in [0,1], \]

where $\mathcal{G} = \sigma a^{(0,1)} - \sigma^{(1,0)} - a\sigma^{(0,1)} - 1/2 \cdot \sigma^2\sigma^{(0,2)}$ involves partial derivatives
of $a$ and $\sigma$, and the one-dimensional random field $\mathcal{M}$ is given by

\[ \mathcal{M}(t,s) = \exp\Bigl( \int_t^s \bigl(a^{(0,1)} - 1/2 \cdot (\sigma^{(0,1)})^2\bigr)(u,X(u))\,du + \int_t^s \sigma^{(0,1)}(u,X(u))\,dW(u) \Bigr) \]

for $0 \le t \le s \le 1$. Roughly speaking, $\mathcal{M}(t,\cdot)$ is the $L_2$-derivative of the solution
$X$ with respect to its state at time $t$; see Remark 6.

To give a flavor of our results, let p = 2, and consider the minimal error

\[ e_2^{**}(N) = \inf\{ e_2(\widehat{X}(1)) : c(\widehat{X}(1)) \le N \} \]

that can be achieved by numerical methods using at most N evaluations of
the driving Brownian motion on the average. By Theorem 1(i),

\[ \lim_{N\to\infty} N \cdot e_2^{**}(N) = 1/\sqrt{12} \cdot \Bigl( E \int_0^1 |\mathcal{Y}(t)|^{2/3}\,dt \Bigr)^{3/2}, \tag{2} \]

which answers question 2 in an asymptotic sense.

For answering question 1 we exploit general ideas from estimation of
weighted integrals of stochastic processes; see, for example, Ritter (2000)
and the references therein. We construct an easy to implement adaptive
scheme $\widehat{X}_{2,n}^{**}$ with step-size roughly proportional to $|\widehat{\mathcal{Y}}_n(t)|^{-2/3}$, where $\widehat{\mathcal{Y}}_n$ is
a suitable approximation to the random weight $\mathcal{Y}$. The resulting approximation
$\widehat{X}_{2,n}^{**}(1)$ at t = 1 satisfies

\[ \lim_{n\to\infty} c(\widehat{X}_{2,n}^{**}(1)) \cdot e_2(\widehat{X}_{2,n}^{**}(1)) = 1/\sqrt{12} \cdot \Bigl( E \int_0^1 |\mathcal{Y}(t)|^{2/3}\,dt \Bigr)^{3/2}; \tag{3} \]

see Theorem 2(i). Consequently, by (2) this method performs asymptotically
optimally for every equation (1) with a nonzero asymptotic constant on the
right-hand side above.

A natural question is whether the asymptotic constant in (3) can also
be achieved by a numerical method based on a prefixed discretization. The
answer turns out to be negative in general. In fact, consider the minimal
error

\[ e_2(N) = \inf\{ e_2(E(X(1) \mid W(t_1), \dots, W(t_N))) : 0 < t_1 < \dots < t_N \le 1 \} \]

that can be obtained if the driving Brownian motion W is evaluated at N
prefixed points in the unit interval. By Theorem 1(iii),

\[ \lim_{N\to\infty} N \cdot e_2(N) = 1/\sqrt{12} \cdot \Bigl( \int_0^1 (E|\mathcal{Y}(t)|^2)^{1/3}\,dt \Bigr)^{3/2}. \tag{4} \]

Thus the order of convergence is still 1/N but the asymptotic constant
in (4) may be considerably larger than the asymptotic constant in (2); see
Example 1. Somewhat surprisingly, as a by-product of (4), it turns out that
in general the Milstein scheme does not perform asymptotically optimally;
see Remark 7.

In Section 2 we state our assumptions on equation (1). We use global
Lipschitz and linear growth conditions on the drift coefficient $a$, the diffusion
coefficient $\sigma$ and partial derivatives of these coefficients, as well as a moment
condition on the initial value $X(0)$.

Best rates of convergence for approximation at t = 1 based on point evaluations
of W are stated in Section 3. More specifically, we analyze the minimal
errors that can be achieved if W is evaluated at:

(a) sequentially chosen points with $E(\nu) \le N$,

(b) sequentially chosen points $\tau_1, \dots, \tau_\nu$ with $\nu \le N$,

(c) prefixed points $t_1, \dots, t_N$,

(d) equidistant points $1/N, 2/N, \dots, 1$.

In Section 4 we introduce a new class of numerical schemes, which leads to
asymptotically optimal approximations for each of the cases (a)-(d) above.
Proofs are postponed to Section 5 and the Appendix.
2. Assumptions. We will use the following Lipschitz and linear growth
conditions on functions $f\colon [0,1]\times\mathbb{R}\to\mathbb{R}$.

(L) There exists a constant K > 0 such that

\[ |f(t,x) - f(t,y)| \le K \cdot |x - y| \]

for all $t \in [0,1]$ and $x, y \in \mathbb{R}$.

(LG) There exists a constant K > 0 such that

\[ |f(t,x)| \le K \cdot (1 + |x|) \]

for all $t \in [0,1]$ and $x \in \mathbb{R}$.

(LLG) There exists a constant K > 0 such that

\[ |f(s,x) - f(t,x)| \le K \cdot (1 + |x|) \cdot |s - t| \]

for all $s, t \in [0,1]$ and $x \in \mathbb{R}$.

Throughout this paper we impose the following regularity conditions on
the drift coefficient $a$, the diffusion coefficient $\sigma$ and the initial value $X(0)$.

(A) (i) Both $a$ and $\sigma$ satisfy (L) as well as (LLG).

(ii) The partial derivatives 

of 1 ’ 0 ), 

exist and satisfy (L) as well as (LLG). 

(iii) The functions a 2 a,(°’ 2 ') and cr 2 a^ 0,2 '> satisfy (LG). 

(iv) The function aa^ 0,2 ^ satisfies (L). 

(B) The initial value $X(0)$ is independent of W and satisfies $E|X(0)|^{16p} < \infty$.

For instance, (A) is satisfied if the partial derivatives

\[ a^{(i,j)}, \ \sigma^{(i,j)}, \qquad i = 0,1,2, \quad j = 0,1,2,3, \]

exist and are continuous and bounded.

Note that (A) together with (B) implies that a pathwise unique strong
solution of the equation (1) with initial value $X(0)$ exists. In particular, the
conditions assure that

\[ E\Bigl( \sup_{0 \le t \le 1} |X(t)|^{16p} \Bigr) < \infty \tag{5} \]

as well as

\[ E|X(s) - X(t)|^{16p} \le c \cdot |s - t|^{8p}, \tag{6} \]

where the constant c > 0 only depends on p and the constants from (A) and (B).
3. Best rates of convergence. We consider arbitrary numerical methods
for pathwise approximation of the solution X at the point t = 1 that are
based on a realization of the initial value X(0) and a finite number of
observations of a trajectory of the driving Brownian motion W at points in
the unit interval. The formal definition of the class of these methods and
subclasses of interest is given in Section 3.1. Section 3.2 contains the analysis
of the corresponding minimal errors.

3.1. General methods for approximation at t = 1. A general adaptive
approximation $\widehat{X}(1)$ of $X(1)$ is defined by three sequences

\[ \psi = (\psi_n)_{n\in\mathbb{N}}, \qquad \chi = (\chi_n)_{n\in\mathbb{N}}, \qquad \phi = (\phi_n)_{n\in\mathbb{N}}, \]

of measurable mappings

\[ \psi_n\colon \mathbb{R}^n \to {]0,1]}, \qquad \chi_n\colon \mathbb{R}^{n+1} \to \{\mathrm{STOP}, \mathrm{GO}\}, \qquad \phi_n\colon \mathbb{R}^{n+1} \to \mathbb{R}. \]

The sequence $\psi$ determines the evaluation sites of a trajectory $w$ of $W$ in the
interval ]0,1]. The total number of evaluations to be made is determined by
the sequence $\chi$ of stopping rules. Finally, $\phi$ is used to obtain the real-valued
approximation to the solution X at t = 1 from the observed data.

To be more precise, the sequential observation of a trajectory $w$ starts at
the knot $\psi_1(x)$, where $x$ denotes the realization of the initial value. After
$n$ steps we have obtained the data $D_n(x,w) = (x, y_1, \dots, y_n)$, where
$y_1 = w(\psi_1(x)), \dots, y_n = w(\psi_n(x, y_1, \dots, y_{n-1}))$, and we decide to stop or to further
evaluate $w$ according to the value of $\chi_n(D_n(x,w))$. The total number of
observations is thus given by $\nu(x,w) = \min\{n \in \mathbb{N} : \chi_n(D_n(x,w)) = \mathrm{STOP}\}$.
If $\nu(x,w) < \infty$, then the whole data $D(x,w) = D_{\nu(x,w)}(x,w)$ are used to
construct the estimate $\phi_{\nu(x,w)}(D(x,w)) \in \mathbb{R}$.

For obvious reasons we require $\nu(X(0),W) < \infty$ with probability 1. Then
the resulting approximation is given by

\[ \widehat{X}(1) = \phi_{\nu(X(0),W)}(D(X(0),W)). \]

As a rough measure for the computational cost of $\widehat{X}(1)$ we use

\[ c(\widehat{X}(1)) = E(\nu(X(0),W)), \]

that is, the expected number of evaluations of the driving Brownian motion
W. Clearly, a more realistic measure also involves, for example, a count
of the arithmetical operations needed to compute $\widehat{X}(1)$.
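
The sequential mechanism just described can be sketched in code. The following Python fragment is only an illustration under stated assumptions: the names `BrownianPath` and `run_method` are hypothetical, and the Brownian path is simulated lazily, with each new value drawn from its conditional law given the values already observed (an independent Gaussian increment beyond the last site, a Brownian bridge between two sites). The callables `psi`, `chi` and `phi` play the roles of the three sequences of mappings and receive the data observed so far.

```python
import bisect
import math
import random


class BrownianPath:
    """Lazily sampled Brownian path on [0, 1]; repeated queries stay consistent."""

    def __init__(self, rng):
        self.rng = rng
        self.times = [0.0]    # sorted evaluation sites; W(0) = 0 is known
        self.values = [0.0]

    def __call__(self, t):
        i = bisect.bisect_left(self.times, t)
        if i < len(self.times) and self.times[i] == t:
            return self.values[i]              # site was observed before
        t_left, w_left = self.times[i - 1], self.values[i - 1]
        if i == len(self.times):
            # beyond the last site: independent Gaussian increment
            w = w_left + self.rng.gauss(0.0, math.sqrt(t - t_left))
        else:
            # between two observed sites: Brownian bridge conditional law
            t_right, w_right = self.times[i], self.values[i]
            lam = (t - t_left) / (t_right - t_left)
            mean = w_left + lam * (w_right - w_left)
            var = (t - t_left) * (t_right - t) / (t_right - t_left)
            w = self.rng.gauss(mean, math.sqrt(var))
        self.times.insert(i, t)
        self.values.insert(i, w)
        return w


def run_method(psi, chi, phi, x0, path):
    """Return (estimate, nu) for the method described by (psi, chi, phi)."""
    data = [x0]                        # D_0 = (x)
    while True:
        site = psi(data)               # next knot in ]0, 1], chosen from the data
        data.append(path(site))        # observe W at that knot
        if chi(data) == "STOP":
            nu = len(data) - 1         # number of observations of W
            return phi(data), nu
```

For instance, `psi = lambda d: len(d) / 4` together with a rule `chi` that stops after four observations gives an equidistant method; averaging the returned `nu` over many simulated paths would estimate the cost $c(\widehat{X}(1))$.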

Let $\mathbb{X}^{**}$ denote the class of all methods of the above form and put

\[ \mathbb{X}_N^{**} = \{ \widehat{X}(1) \in \mathbb{X}^{**} : c(\widehat{X}(1)) \le N \}. \]

Then

\[ e_p^{**}(N) = \inf\{ e_p(\widehat{X}(1)) : \widehat{X}(1) \in \mathbb{X}_N^{**} \} \]

is the minimal error that can be obtained by approximations that use at
most N sequential observations of W on the average.

The number and the location of the evaluation sites that are used by
an approximation $\widehat{X}(1) \in \mathbb{X}^{**}$ depend on the respective realization $x$ of the
initial value $X(0)$ and the path $w$ of the driving Brownian motion W. It is
natural to ask whether, in general, the minimal errors $e_p^{**}(N)$ can (asymptotically)
be achieved by methods that use the same evaluation sites for every
trajectory of W. In order to investigate questions of this type, we introduce
the following subclasses of $\mathbb{X}^{**}$ that are subject to certain restrictions on
the choice of evaluation sites.

The subclass $\mathbb{X}^* \subset \mathbb{X}^{**}$ consists of all approximations that use the same
number of observations for every $x$ and $w$. Formally, this means that the
mappings $\chi_n$ are constant and $\nu = \min\{n \in \mathbb{N} : \chi_n = \mathrm{STOP}\}$.

Additionally, we consider the subclass $\mathbb{X} \subset \mathbb{X}^*$ of all approximations that
evaluate W at the same points for every $x$ and every path $w$. Formally, the
mappings $\psi_n$ and $\chi_n$ are constant so that $\nu = \min\{n \in \mathbb{N} : \chi_n = \mathrm{STOP}\}$ and
$D(x,w) = (x, w(\psi_1), \dots, w(\psi_\nu))$. For instance, if the discretization is fixed,
then the corresponding Euler scheme and the Milstein scheme at t = 1 belong
to the class $\mathbb{X}$.

Finally, the class $\mathbb{X}^{\mathrm{equi}} \subset \mathbb{X}$ consists of all approximations that use equidistant
evaluation sites for the driving Brownian motion.

The definition of the respective classes $\mathbb{X}_N^*$, $\mathbb{X}_N$, $\mathbb{X}_N^{\mathrm{equi}}$ and the corresponding
minimal errors $e_p^*(N)$, $e_p(N)$ and $e_p^{\mathrm{equi}}(N)$ is canonical.

We stress that the class $\mathbb{X}^{**}$ contains all commonly studied methods for
approximation at t = 1 that are based on function values of the driving
Brownian motion. Formally, the corresponding sequences $\psi$, $\chi$ and $\phi$ then
depend on the respective drift coefficient $a$ and diffusion coefficient $\sigma$. In the
majority of cases, partial information about the coefficients, for example,
finitely many function values or derivative values, is sufficient to compute
the approximations $\widehat{X}(1)$.

In the present paper we present (asymptotically) sharp upper and lower
bounds for the minimal errors defined above. The upper bounds are achieved
by methods that also need only partial information about $a$ and $\sigma$. On
the other hand, no restriction on the available information about $a$ and $\sigma$
is present in the definition of the class $\mathbb{X}^{**}$. Therefore, the lower bounds
hold even for strong approximations that may specifically be tuned to the
respective coefficients. As an example, consider an approximation of the
form

\[ \widehat{X}(1) = E(X(1) \mid W(t_1), \dots, W(t_N)), \]

which belongs to the class $\mathbb{X}_N$ and might not even be implementable.
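3.2. Analysis of minimal errors. Let $m_p$ denote the pth root of the pth
absolute moment of a standard normal variable, that is,

\[ m_p = \Bigl( \int_{-\infty}^{\infty} |y|^p / (2\pi)^{1/2} \cdot \exp(-y^2/2)\,dy \Bigr)^{1/p}. \]

Recall the weighting process $\mathcal{Y}$ from Section 1 and define the constants

\[ C_p^{**} = m_p \cdot \Bigl( E \Bigl( \int_0^1 |\mathcal{Y}(t)|^{2/3}\,dt \Bigr)^{3p/(2(p+1))} \Bigr)^{(p+1)/p}, \]

\[ C_p^{*} = m_p \cdot \Bigl( E \Bigl( \int_0^1 |\mathcal{Y}(t)|^{2/3}\,dt \Bigr)^{3p/2} \Bigr)^{1/p}, \]

\[ C_2 = \Bigl( \int_0^1 (E|\mathcal{Y}(t)|^2)^{1/3}\,dt \Bigr)^{3/2}, \]

\[ C_p^{\mathrm{equi}} = m_p \cdot \Bigl( E \Bigl( \int_0^1 |\mathcal{Y}(t)|^{2}\,dt \Bigr)^{p/2} \Bigr)^{1/p}. \]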

Theorem 1. The minimal errors satisfy:

(i) $\lim_{N\to\infty} N \cdot e_p^{**}(N) = C_p^{**}/\sqrt{12}$,

(ii) $\lim_{N\to\infty} N \cdot e_p^{*}(N) = C_p^{*}/\sqrt{12}$,

(iii) $\lim_{N\to\infty} N \cdot e_2(N) = C_2/\sqrt{12}$,

(iv) $\lim_{N\to\infty} N \cdot e_p^{\mathrm{equi}}(N) = C_p^{\mathrm{equi}}/\sqrt{12}$.


Clearly, the asymptotic constants vanish altogether iff $C_2^{\mathrm{equi}} = 0$. Thus, if
$C_2^{\mathrm{equi}} > 0$, then the order of convergence of the minimal errors is 1/N for all
of the above classes. However, note that

\[ C_p^{**} \le C_p^{*} \le C_p^{\mathrm{equi}}, \qquad C_2^{**} \le C_2^{*} \le C_2 \le C_2^{\mathrm{equi}}, \]

with strict inequality in most cases. See Remark 2 for the case $C_2^{\mathrm{equi}} = 0$ and
Remark 4 for a characterization of equality of the asymptotic constants.


Example 1. Consider the linear equation

\[ dX(t) = \alpha(t) \cdot X(t)\,dt + \beta(t) \cdot X(t)\,dW(t) \]

with initial condition $X(0) = 1$. Clearly, condition (A) is satisfied if $\alpha$ and
$\beta$ have Lipschitz continuous derivatives $\alpha'$ and $\beta'$, respectively. The corresponding
field $\mathcal{M}$ is given by

\[ \mathcal{M}(t,s) = \exp\Bigl( \int_t^s (\alpha - 1/2 \cdot \beta^2)(u)\,du + \int_t^s \beta(u)\,dW(u) \Bigr), \]

and we have $X(t) = \mathcal{M}(0,t)$. Moreover, $\mathcal{G}(t,x) = -\beta'(t) \cdot x$, so that the
weighting process $\mathcal{Y}$ satisfies $\mathcal{Y}(t) = -\beta'(t) \cdot \mathcal{M}(0,t) \cdot \mathcal{M}(t,1) = -\beta'(t) \cdot X(1)$.
Straightforward calculations yield for $q \in \mathbb{R} \setminus \{0\}$,

\[ (E|X(1)|^q)^{1/q} = e^{\|\alpha\|_1 - 1/2 \cdot \|\beta\|_2^2} \cdot e^{q/2 \cdot \|\beta\|_2^2}. \]

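Thus

\[ C_p^{**} = m_p \cdot e^{\|\alpha\|_1 - 1/2 \cdot \|\beta\|_2^2} \cdot \|\beta'\|_{2/3} \cdot e^{p/(2(p+1)) \cdot \|\beta\|_2^2}, \]

\[ C_p^{*} = m_p \cdot e^{\|\alpha\|_1 - 1/2 \cdot \|\beta\|_2^2} \cdot \|\beta'\|_{2/3} \cdot e^{p/2 \cdot \|\beta\|_2^2}, \]

\[ C_2 = C_2^{*}, \]

\[ C_p^{\mathrm{equi}} = m_p \cdot e^{\|\alpha\|_1 - 1/2 \cdot \|\beta\|_2^2} \cdot \|\beta'\|_{2} \cdot e^{p/2 \cdot \|\beta\|_2^2}. \]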

If $\alpha = 0$ and $\beta(t) = b \cdot t$ with $b \in \mathbb{R}$, then $\|\beta\|_2^2 = b^2/3$ and $\|\beta'\|_{2/3} = \|\beta'\|_2 = |b|$, so that

\[ C_2^{\mathrm{equi}} = C_2 = C_2^{*} = |b| \cdot e^{b^2/6} \]

and

\[ C_2^{**} = |b| \cdot e^{-b^2/18}, \]

which shows that adapting the number of evaluations of W to the particular
trajectory of the solution X is essential in this case. Note that the constant
$C_2^{**}$ is achieved by the adaptive method to be introduced in Section 4.3.1.
Thus, if, for example, $b = 5$, then $C_2^{*}/C_2^{**} = e^{2b^2/9} \approx 258$, so that, asymptotically,
the error of this method is at least 258 times smaller than the error of any
approximation based on a fixed number of evaluations of W.
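
As a numerical cross-check of this example, the closed-form expressions for $C_p^{**}$, $C_p^{*}$ and $C_p^{\mathrm{equi}}$ can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, the special case $\alpha = 0$, $\beta(t) = b \cdot t$ is hard-coded, and $m_p$ is computed from the standard identity $E|Z|^p = 2^{p/2}\,\Gamma((p+1)/2)/\sqrt{\pi}$ for a standard normal variable $Z$.

```python
import math


def m_p(p):
    """p-th root of the p-th absolute moment of a standard normal variable."""
    moment = 2.0 ** (p / 2.0) * math.gamma((p + 1.0) / 2.0) / math.sqrt(math.pi)
    return moment ** (1.0 / p)


def constants_linear_ramp(b, p=2.0):
    """Return (C_p^{**}, C_p^*, C_p^{equi}) for alpha = 0 and beta(t) = b * t."""
    beta_sq = b * b / 3.0          # ||beta||_2^2 = int_0^1 (b t)^2 dt
    norm_dbeta = abs(b)            # beta' = b is constant, so both norms of beta' equal |b|
    base = m_p(p) * math.exp(-0.5 * beta_sq) * norm_dbeta
    c_adaptive = base * math.exp(p / (2.0 * (p + 1.0)) * beta_sq)
    c_fixed_number = base * math.exp(p / 2.0 * beta_sq)
    c_equidistant = base * math.exp(p / 2.0 * beta_sq)
    return c_adaptive, c_fixed_number, c_equidistant


c_adaptive, c_fixed_number, _ = constants_linear_ramp(5.0)
ratio = c_fixed_number / c_adaptive    # equals exp(2 * 25 / 9) for p = 2
```

For b = 5 and p = 2 the ratio of the second to the first constant is $e^{50/9}$, which is the factor of roughly 258 quoted in the example.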

Remark 1. Clark and Cameron (1980) consider the autonomous equation

\[ dX(t) = a(X(t))\,dt + dW(t), \qquad X(0) = x \in \mathbb{R}, \]

where $a$ has bounded derivatives up to order 3. They obtain

\[ \lim_{N\to\infty} N \cdot \bigl( E|X(1) - E(X(1) \mid W(1/N), \dots, W(1))|^2 \bigr)^{1/2} = \Bigl( \int_0^1 E\Bigl( (a'(X(t)))^2 \cdot \exp\Bigl( 2 \cdot \int_t^1 a'(X(s))\,ds \Bigr) \Bigr)\,dt \Bigr)^{1/2} \Big/ \sqrt{12}. \]

Note that the corresponding weighting process is given by

\[ \mathcal{Y}(t) = a'(X(t)) \cdot \exp\Bigl( \int_t^1 a'(X(s))\,ds \Bigr) \]

so that the above result is a consequence of Theorem 1(iv).
More generally, Cambanis and Hu (1996) study autonomous equations

\[ dX(t) = a(X(t))\,dt + \sigma(X(t))\,dW(t), \qquad X(0) = x \in \mathbb{R}, \]

where $a$ and $\sigma$ have bounded derivatives up to order 3. They analyze the
minimal error that can be achieved by methods from the class $\mathbb{X}$ that are
based on so-called regularly generated discretizations.

To be more precise, let $h$ be a strictly positive density on [0,1] and define
the discretization

\[ 0 < t_1^{(N)} < \dots < t_N^{(N)} = 1 \]

by taking the $l/N$-quantiles corresponding to $h$, that is,

\[ \int_0^{t_l^{(N)}} h(t)\,dt = l/N, \qquad l = 1, \dots, N. \]

Consider the optimal approximation in the mean squared sense

\[ \widehat{X}_N^{(h)}(1) = E\bigl( X(1) \mid W(t_1^{(N)}), \dots, W(t_N^{(N)}) \bigr) \]

that is based on the observations $W(t_1^{(N)}), \dots, W(t_N^{(N)})$ and put

\[ C(h) = \Bigl( \int_0^1 E|\mathcal{Y}(t)|^2 / h^2(t)\,dt \Bigr)^{1/2}. \]

If $h$ has a bounded derivative, then

\[ \lim_{N\to\infty} N \cdot e_2(\widehat{X}_N^{(h)}(1)) = C(h)/\sqrt{12}. \]

Taking $h = 1$ yields Theorem 1(iv) in the case p = 2, since
$e_2(\widehat{X}_N^{(1)}(1)) = e_2^{\mathrm{equi}}(N)$ and $C(1) = C_2^{\mathrm{equi}}$.

Taking

\[ h^*(t) = (E|\mathcal{Y}(t)|^2)^{1/3} \Big/ \int_0^1 (E|\mathcal{Y}(s)|^2)^{1/3}\,ds \]

yields the minimal constant

\[ C_2 = C(h^*) = \inf_h C(h). \]

Thus, by Theorem 1(iii), the approximation $\widehat{X}_N^{(h^*)}(1)$ is asymptotically optimal
in the class $\mathbb{X}$ if $C_2 > 0$. However, note that the method $\widehat{X}_N^{(h^*)}(1)$ is much
harder to implement than the asymptotically optimal method introduced in
Section 4.
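
The regularly generated discretization used in this remark can be computed numerically by inverting the distribution function of $h$. The sketch below is only one possible way to do this and is not taken from the cited work: the function name, the trapezoidal quadrature and the resolution parameter `grid` are all assumptions made for illustration.

```python
def regular_discretization(h, n_points, grid=10_000):
    """Sites t_1 < ... < t_N with int_0^{t_l} h(t) dt = l/N for a strictly positive h."""
    ts = [i / grid for i in range(grid + 1)]
    cdf = [0.0]
    for i in range(grid):
        # trapezoidal rule on [ts[i], ts[i+1]]
        cdf.append(cdf[-1] + 0.5 * (h(ts[i]) + h(ts[i + 1])) / grid)
    total = cdf[-1]                  # normalise in case h does not integrate to 1 exactly
    sites, j = [], 0
    for l in range(1, n_points + 1):
        target = total * l / n_points
        while j < grid - 1 and cdf[j + 1] < target:
            j += 1
        # invert the piecewise linear interpolant of the distribution function
        frac = (target - cdf[j]) / (cdf[j + 1] - cdf[j])
        sites.append(ts[j] + frac / grid)
    return sites


equidistant = regular_discretization(lambda t: 1.0, 4)   # approximately 0.25, 0.5, 0.75, 1.0
```

With $h = 1$ the sites are equidistant, in line with the case treated in Theorem 1(iv); a density proportional to $(E|\mathcal{Y}(t)|^2)^{1/3}$ would have to be supplied by the user, since those moments are problem dependent.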
Remark 2. Theorem 1 determines the rates of convergence of the minimal
errors only in the case of nonzero asymptotic constants. Clearly, these
constants vanish altogether iff with probability 1,

\[ \mathcal{G}(t,X(t)) = 0 \quad \text{for every } t \in [0,1]. \tag{7} \]

For a large class of equations, it turns out that (7) holds iff there exists a
measurable function $f\colon \mathbb{R} \times [0,1] \times \mathbb{R} \to \mathbb{R}$ such that, with probability 1,

\[ X(t) = f(X(0),t,W(t)) \quad \text{for every } t \in [0,1]. \tag{8} \]

Obviously, if (8) holds, then X(1) can be simulated exactly. Thus (8) implies
(7) by Theorem 1.

Clark and Cameron (1980) provide sufficient conditions for the equivalence
of (7) and (8) in the case of autonomous equations. Slightly modifying
their approach, one can also treat general equations. If, in addition to
assumption (A), the conditions

(i) $a$ and $\sigma^{(1,0)}$ are bounded,

(ii) $\inf_{t,x} |\sigma(t,x)| > 0$,

are satisfied, then (7) and (8) are equivalent.

The equivalence of (7) and (8) also holds for the linear equation from
Example 1. Note that condition (ii) above is not satisfied in this case.
However, (7) implies $\beta' = 0$ and therefore

\[ X(t) = \exp\Bigl( \int_0^t (\alpha - 1/2 \cdot \beta^2)(u)\,du + \beta(t) \cdot W(t) \Bigr), \qquad t \in [0,1]. \]

Finally, assume that $a$ and $\sigma$ have partial derivatives of any order. Then,
by a general result of Yamato (1979), (8) is equivalent to

\[ \mathcal{G} = 0, \tag{6*} \]

which clearly implies (7).

Note that (6*) implies that the Wagner-Platen scheme only uses function
values of the driving Brownian motion; see Section 4. Thus the order of
convergence of the minimal errors $e_p^{\mathrm{equi}}(N)$ is at least $1/N^{3/2}$ in this case.

Remark 3. Rümelin (1982) analyzes a class of Runge-Kutta methods
based on an equidistant discretization, that is, a subclass of $\mathbb{X}^{\mathrm{equi}}$,
with respect to the mean squared error at t = 1, that is, p = 2. For this class
Rümelin shows that, under stronger conditions on $a$ and $\sigma$, the order of
convergence of the corresponding minimal errors is 1/N iff $\mathcal{G} \neq 0$. Moreover,
if $\mathcal{G} = 0$, then the order is at least $1/N^{3/2}$.
Remark 4. We briefly comment on equality of the asymptotic constants
in the case p = 2. Clearly, $C_2 = C_2^{\mathrm{equi}}$ iff there exists $\gamma \in \mathbb{R}$ such that

\[ E(\mathcal{Y}(t))^2 = \gamma \quad \text{for all } t \in [0,1]. \]

Furthermore, $C_2^{*} = C_2$ iff there exist $t_0 \in [0,1]$ and a function $\gamma \in C([0,1])$
such that, with probability 1,

\[ \mathcal{Y}(t) = \gamma(t) \cdot \mathcal{Y}(t_0) \quad \text{for all } t \in [0,1]. \]

Note that the latter condition holds for the linear equation from Example 1
with $\gamma = \beta'/\beta'(1)$ and $t_0 = 1$.

Finally, by the Markov property of X, we have $C_2^{**} = C_2^{*}$ iff there exists
a function $\gamma \in C([0,1])$ such that, with probability 1,

\[ \mathcal{Y}(t) = \gamma(t) \quad \text{for all } t \in [0,1]. \]

In particular, if $a$ and $\sigma$ are state independent, then $C_2^{**} = C_2^{*} = C_2 = \|\mathcal{Y}\|_{2/3}$.


Remark 5. Theorem 1 shows that pathwise approximation at a single
point is strongly connected to weighted integration of a Brownian motion.
To be more precise, let $\rho\colon [0,1] \to [0,\infty[$ be continuous, and consider the
problem of estimating the weighted integral

\[ I = \int_0^1 \rho(t) \cdot W(t)\,dt \]

of a Brownian motion W on the basis of N observations of W in the unit
interval. The corresponding minimal mean squared error

\[ e(N) = \inf\{ (E(I - E(I \mid W(t_1), \dots, W(t_N)))^2)^{1/2} : 0 < t_1 < \dots < t_N \le 1 \} \]

satisfies

\[ \lim_{N\to\infty} N \cdot e(N) = 1/\sqrt{12} \cdot c_\rho, \]

where

\[ c_\rho = \Bigl( \int_0^1 (\rho(t))^{2/3}\,dt \Bigr)^{3/2}; \]

see Ritter (2000) and the references therein.

Taking the weight

\[ \rho(t) = (E(\mathcal{Y}^2(t)))^{1/2}, \qquad t \in [0,1], \]

yields the constant $C_2$ in Theorem 1(iii).

Using the random weight $|\mathcal{Y}|$, we obtain the random constant $c_{|\mathcal{Y}|}$, and

\[ C_2^{**} = \bigl( E(c_{|\mathcal{Y}|}^{2/3}) \bigr)^{3/2}. \]
As an illustrating example, consider the linear equation with additive
noise

\[ dX(t) = a(t)\,dt + \sigma(t)\,dW(t). \]

Then

\[ X(1) = X(0) + \int_0^1 a(t)\,dt + \sigma(1) \cdot W(1) - \int_0^1 \sigma'(t) \cdot W(t)\,dt. \]

Since $X(0)$ and $\sigma(1) \cdot W(1)$ can be observed, we are basically dealing with
the approximation of the last integral on the right-hand side above. Clearly,
in this case the weighting process $\mathcal{Y}$ is nonrandom with $\mathcal{Y} = -\sigma'$.


Remark 6. Consider, for every $x \in \mathbb{R}$ and $t \in [0,1]$, the solution $X_{t,x}$ of
the equation

\[ dX_{t,x}(s) = a(s, X_{t,x}(s))\,ds + \sigma(s, X_{t,x}(s))\,dW(s), \qquad t \le s \le 1, \]

with initial value $X_{t,x}(t) = x$. As a well-known fact, the distribution of the
process $X_{t,x}$ on $C([t,1])$ coincides with the conditional distribution of the
solution $X(s)$, $t \le s \le 1$, given $X(t) = x$. Due to condition (A), for every
$s \ge t$, there exists the $L_2$-derivative $X'_{t,x}(s)$ of $X_{t,x}(s)$ with respect to the
initial value $x$, that is,

\[ \lim_{h \to 0} E\bigl( 1/h \cdot (X_{t,x+h}(s) - X_{t,x}(s)) - X'_{t,x}(s) \bigr)^2 = 0. \]

Moreover, the process $X'_{t,x}$ is the unique solution of the equation

\[ dX'_{t,x}(s) = a^{(0,1)}(s, X_{t,x}(s)) \cdot X'_{t,x}(s)\,ds + \sigma^{(0,1)}(s, X_{t,x}(s)) \cdot X'_{t,x}(s)\,dW(s), \qquad t \le s \le 1, \]

with initial value $X'_{t,x}(t) = 1$, and is explicitly given by

\[ X'_{t,x}(s) = \exp\Bigl( \int_t^s \bigl(a^{(0,1)} - 1/2 \cdot (\sigma^{(0,1)})^2\bigr)(u, X_{t,x}(u))\,du + \int_t^s \sigma^{(0,1)}(u, X_{t,x}(u))\,dW(u) \Bigr); \]

see, for example, Friedman (1975) and Karatzas and Shreve (1991). Replacing
$X_{t,x}$ by the solution $X$ yields the defining equation for the field $\mathcal{M}$.

4. Asymptotically optimal adaptive schemes. Let $k \in \mathbb{N}$ and consider the
equidistant discretization

\[ t_l = l/k, \qquad l = 0, \dots, k. \tag{9} \]

Our adaptive method basically works as follows. First, we evaluate the
driving Brownian motion at a coarse grid (9), and we compute a corresponding
truncated Wagner-Platen scheme as well as a discrete approximation to
the weighting process $\mathcal{Y}$. Following the main idea for nonrandom weighted
integration, the latter estimate determines the number and the location of
the additional evaluation sites for the driving Brownian motion. The resulting
observations are then used to obtain a suitable approximation of
the difference between the Wagner-Platen scheme and its truncated version.
Finally, we update the truncated Wagner-Platen scheme by adding
this approximation.

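For convenience we briefly recall the definition of the Wagner-Platen
scheme $\widehat{X}_k^{\mathrm{WP}}$ corresponding to the discretization (9). This scheme is defined
by $\widehat{X}_k^{\mathrm{WP}}(0) = X(0)$ and

\[
\begin{aligned}
\widehat{X}_k^{\mathrm{WP}}(t_{l+1}) ={}& \widehat{X}_k^{\mathrm{WP}}(t_l) + a(t_l, \widehat{X}_k^{\mathrm{WP}}(t_l)) \cdot (t_{l+1} - t_l) \\
&+ \sigma(t_l, \widehat{X}_k^{\mathrm{WP}}(t_l)) \cdot (W(t_{l+1}) - W(t_l)) \\
&+ 1/2 \cdot (\sigma\sigma^{(0,1)})(t_l, \widehat{X}_k^{\mathrm{WP}}(t_l)) \cdot \bigl( (W(t_{l+1}) - W(t_l))^2 - (t_{l+1} - t_l) \bigr) \\
&+ \bigl( \sigma^{(1,0)} + a\sigma^{(0,1)} - 1/2 \cdot \sigma(\sigma^{(0,1)})^2 \bigr)(t_l, \widehat{X}_k^{\mathrm{WP}}(t_l)) \cdot (W(t_{l+1}) - W(t_l)) \cdot (t_{l+1} - t_l) \\
&+ 1/6 \cdot \bigl( \sigma(\sigma^{(0,1)})^2 + \sigma^2\sigma^{(0,2)} \bigr)(t_l, \widehat{X}_k^{\mathrm{WP}}(t_l)) \cdot (W(t_{l+1}) - W(t_l))^3 \\
&+ 1/2 \cdot \bigl( a^{(1,0)} + a a^{(0,1)} + 1/2 \cdot \sigma^2 a^{(0,2)} \bigr)(t_l, \widehat{X}_k^{\mathrm{WP}}(t_l)) \cdot (t_{l+1} - t_l)^2 \\
&+ \mathcal{G}(t_l, \widehat{X}_k^{\mathrm{WP}}(t_l)) \cdot \int_{t_l}^{t_{l+1}} (W(s) - W(t_l))\,ds
\end{aligned}
\]

for $l = 0, \dots, k-1$; see Wagner and Platen (1978). For the definition of this
scheme in the case of a general system of equations, we refer to Kloeden and
Platen (1995).

We stress that in general the Wagner-Platen approximation $\widehat{X}_k^{\mathrm{WP}}(1)$ at
t = 1 does not belong to the class $\mathbb{X}$ since function values as well as integrals
of the trajectories of the driving Brownian motion are used.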

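4.1. The truncated Wagner-Platen scheme $\widehat{X}_k^{\mathrm{WPt}}$. Dropping the last summand
in the definition of the scheme above, we obtain a truncated version
of the Wagner-Platen scheme that is based only on function values of
the driving Brownian motion. Formally, $\widehat{X}_k^{\mathrm{WPt}}$ is defined by $\widehat{X}_k^{\mathrm{WPt}}(0) = X(0)$
and

\[
\begin{aligned}
\widehat{X}_k^{\mathrm{WPt}}(t_{l+1}) ={}& \widehat{X}_k^{\mathrm{WPt}}(t_l) + a(t_l, \widehat{X}_k^{\mathrm{WPt}}(t_l)) \cdot (t_{l+1} - t_l) \\
&+ \sigma(t_l, \widehat{X}_k^{\mathrm{WPt}}(t_l)) \cdot (W(t_{l+1}) - W(t_l)) \\
&+ 1/2 \cdot (\sigma\sigma^{(0,1)})(t_l, \widehat{X}_k^{\mathrm{WPt}}(t_l)) \cdot \bigl( (W(t_{l+1}) - W(t_l))^2 - (t_{l+1} - t_l) \bigr) \\
&+ \bigl( \sigma^{(1,0)} + a\sigma^{(0,1)} - 1/2 \cdot \sigma(\sigma^{(0,1)})^2 \bigr)(t_l, \widehat{X}_k^{\mathrm{WPt}}(t_l)) \cdot (W(t_{l+1}) - W(t_l)) \cdot (t_{l+1} - t_l) \\
&+ 1/6 \cdot \bigl( \sigma(\sigma^{(0,1)})^2 + \sigma^2\sigma^{(0,2)} \bigr)(t_l, \widehat{X}_k^{\mathrm{WPt}}(t_l)) \cdot (W(t_{l+1}) - W(t_l))^3 \\
&+ 1/2 \cdot \bigl( a^{(1,0)} + a a^{(0,1)} + 1/2 \cdot \sigma^2 a^{(0,2)} \bigr)(t_l, \widehat{X}_k^{\mathrm{WPt}}(t_l)) \cdot (t_{l+1} - t_l)^2
\end{aligned}
\]

for $l = 0, \dots, k-1$.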
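
A single step of the truncated scheme translates almost literally into code. The following sketch is an illustration under stated assumptions: the function name and the dictionary keys (`"a"`, `"s"`, `"a_t"`, `"a_x"`, `"a_xx"`, `"s_t"`, `"s_x"`, `"s_xx"` for $a$, $\sigma$ and their partial derivatives $a^{(1,0)}$, $a^{(0,1)}$, $a^{(0,2)}$, $\sigma^{(1,0)}$, $\sigma^{(0,1)}$, $\sigma^{(0,2)}$) are hypothetical conventions, and the user is assumed to supply each coefficient as a callable of $(t, x)$.

```python
def truncated_wp_step(t, x, dt, dw, c):
    """One step of the truncated Wagner-Platen scheme from (t, x) with increment dw of W."""
    a, s = c["a"](t, x), c["s"](t, x)
    a_t, a_x, a_xx = c["a_t"](t, x), c["a_x"](t, x), c["a_xx"](t, x)
    s_t, s_x, s_xx = c["s_t"](t, x), c["s_x"](t, x), c["s_xx"](t, x)
    return (
        x
        + a * dt                                            # drift term
        + s * dw                                            # diffusion term
        + 0.5 * s * s_x * (dw * dw - dt)                    # Milstein correction
        + (s_t + a * s_x - 0.5 * s * s_x ** 2) * dw * dt    # mixed dw * dt term
        + (s * s_x ** 2 + s ** 2 * s_xx) * dw ** 3 / 6.0    # cubic term in dw
        + 0.5 * (a_t + a * a_x + 0.5 * s ** 2 * a_xx) * dt ** 2
    )
```

Iterating this step over the grid (9) with the increments $W(t_{l+1}) - W(t_l)$ yields $\widehat{X}_k^{\mathrm{WPt}}(t_l)$ for $l = 0, \dots, k$; the integral term of the full Wagner-Platen scheme is deliberately absent.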


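4.2. The discrete approximation $\widehat{\mathcal{Y}}_k$ of the random weight $\mathcal{Y}$. Note that
the random field $\mathcal{M}$ satisfies the stochastic differential equations

\[ d\mathcal{M}(t,s) = a^{(0,1)}(s, X(s)) \cdot \mathcal{M}(t,s)\,ds + \sigma^{(0,1)}(s, X(s)) \cdot \mathcal{M}(t,s)\,dW(s), \qquad t \le s \le 1, \]

with initial value

\[ \mathcal{M}(t,t) = 1 \]

for every $t \in [0,1]$.

Using the truncated Wagner-Platen estimates, we thus obtain the following
Euler-type approximation to the field $\mathcal{M}$. Put

\[ \widehat{m}_l = 1 + a^{(0,1)}(t_l, \widehat{X}_k^{\mathrm{WPt}}(t_l)) \cdot (t_{l+1} - t_l) + \sigma^{(0,1)}(t_l, \widehat{X}_k^{\mathrm{WPt}}(t_l)) \cdot (W(t_{l+1}) - W(t_l)) \]

and define the scheme $\widehat{\mathcal{M}}_k$ by

\[ \widehat{\mathcal{M}}_k(t_l, t_r) = \begin{cases} \widehat{m}_l \cdots \widehat{m}_{r-1}, & \text{if } l + 1 \le r \le k, \\ 1, & \text{if } r = l. \end{cases} \]

Now, for $l = 0, \dots, k-1$, we take

\[ \widehat{\mathcal{Y}}_k(t_l) = \widehat{\mathcal{M}}_k(t_{l+1}, 1) \cdot \mathcal{G}(t_l, \widehat{X}_k^{\mathrm{WPt}}(t_l)) \]

as an approximation to $\mathcal{Y}(t_l)$. Note that, in general, all of the observations
$W(t_1), \dots, W(1)$ are needed to compute the estimate $\widehat{\mathcal{Y}}_k(t_l)$.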
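
Since $\widehat{\mathcal{M}}_k(t_{l+1}, 1)$ is a product of the Euler factors with indices $l+1, \dots, k-1$, all weights can be obtained in one backward pass over the grid. The sketch below assumes that the factors $\widehat{m}_l$ and the values $\mathcal{G}(t_l, \widehat{X}_k^{\mathrm{WPt}}(t_l))$ have already been computed and are passed in as lists; the function names are hypothetical.

```python
def euler_factor(a_x, s_x, dt, dw):
    """Euler factor 1 + a^(0,1) * dt + sigma^(0,1) * dw for one grid cell."""
    return 1.0 + a_x * dt + s_x * dw


def weight_estimates(m, g):
    """Weights y[l] = m[l+1] * ... * m[k-1] * g[l] for l = 0, ..., k-1."""
    k = len(m)
    y = [0.0] * k
    tail = 1.0                      # running product m[l+1] * ... * m[k-1]
    for l in range(k - 1, -1, -1):
        y[l] = tail * g[l]
        tail *= m[l]                # extend the product for the next (smaller) l
    return y
```

The backward loop makes the dependence on the whole coarse path visible: the weight at $t_0$ involves every factor $\widehat{m}_1, \dots, \widehat{m}_{k-1}$.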


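Example 2. Consider the linear equation with additive noise from Remark 5.
In this case, we have

\[ \widehat{\mathcal{M}}_k(t_l, t_r) = \mathcal{M}(t_l, t_r) = 1 \]

and

\[ \widehat{\mathcal{Y}}_k(t_l) = \mathcal{Y}(t_l) = -\sigma'(t_l). \]

For the linear equation from Example 1, we obtain

\[ \widehat{\mathcal{Y}}_k(t_l) = -\beta'(t_l) \cdot \widehat{X}_k^{\mathrm{WPt}}(t_l) \times \prod_{r=l+1}^{k-1} \bigl( 1 + \alpha(t_r) \cdot (t_{r+1} - t_r) + \beta(t_r) \cdot (W(t_{r+1}) - W(t_r)) \bigr). \]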
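4.3. The basic adaptive scheme $\widehat{X}_k^{\mu}$. Choose measurable functions

\[ f_l\colon \mathbb{R}^k \to \mathbb{N}_0 \]

for $l = 0, \dots, k-1$. The numbers

\[ \mu_l = f_l(\widehat{\mathcal{Y}}_k(t_0), \dots, \widehat{\mathcal{Y}}_k(t_{k-1})) \]

determine the adaptive equidistant discretizations

\[ \tau_{l,r} = t_l + r/(k \cdot (\mu_l + 1)), \qquad r = 0, \dots, \mu_l + 1, \]

of the subintervals $[t_l, t_{l+1}]$.

Next, the totality of observations $W(\tau_{l,r})$ is used to estimate the difference
$\widehat{X}_k^{\mathrm{WP}} - \widehat{X}_k^{\mathrm{WPt}}$. Put $\mu = (\mu_0, \dots, \mu_{k-1})$, and let $\widetilde{W}^{\mu}$ denote the piecewise linear
interpolation of W at the sites $\tau_{l,r}$. Define the scheme $Q_k^{\mu}$ by $Q_k^{\mu}(0) = 0$ and

\[
\begin{aligned}
Q_k^{\mu}(t_{l+1}) ={}& \bigl( 1 + a^{(0,1)}(t_l, \widehat{X}_k^{\mathrm{WPt}}(t_l)) \cdot (t_{l+1} - t_l) + \sigma^{(0,1)}(t_l, \widehat{X}_k^{\mathrm{WPt}}(t_l)) \cdot (W(t_{l+1}) - W(t_l)) \bigr) \cdot Q_k^{\mu}(t_l) \\
&+ \mathcal{G}(t_l, \widehat{X}_k^{\mathrm{WPt}}(t_l)) \cdot \int_{t_l}^{t_{l+1}} (\widetilde{W}^{\mu}(s) - W(t_l))\,ds
\end{aligned}
\]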

for $l = 0, \dots, k-1$. Note that

\[ Q_k^{\mu}(t_l) = \sum_{r=0}^{l-1} \widehat{\mathcal{Y}}_k(t_r) \cdot \int_{t_r}^{t_{r+1}} (\widetilde{W}^{\mu}(t) - W(t_r))\,dt. \tag{11} \]

Finally, the basic scheme $\widehat{X}_k^{\mu}$ is defined by

\[ \widehat{X}_k^{\mu}(t_l) = \widehat{X}_k^{\mathrm{WPt}}(t_l) + Q_k^{\mu}(t_l), \qquad l = 0, \dots, k. \]

The resulting approximation $\widehat{X}_k^{\mu}(1)$ belongs to the class $\mathbb{X}^{**}$ and is determined
up to the parameters $k$ and $\mu$. Clearly, the number $k$ of the nonadaptive
evaluation points should be small compared to the total number
$\sum_l \mu_l$ of the adaptively chosen points in order to keep track of the random
weight $\mathcal{Y}$. On the other hand, $k$ must be large enough to obtain a sufficiently
good approximation $\widehat{\mathcal{Y}}_k$ to $\mathcal{Y}$. Finally, the number $\mu_l$ of observations of W in
the interval $]t_l, t_{l+1}[$ should be chosen according to the respective local size
of $\mathcal{Y}$. We present three versions of $\widehat{X}_k^{\mu}(1)$ that are based on this principle.

4.3.1. The scheme $\widehat{X}_{p,n}^{**}$ with varying number of observations of W. Choose
a sequence $k_n \in \mathbb{N}$ such that
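and let

\[ \mu_l^{(n)} = \begin{cases} \Bigl\lfloor n \cdot |\widehat{\mathcal{Y}}_{k_n}(t_l)|^{2/3} \Big/ \sum_{r=0}^{k_n-1} |\widehat{\mathcal{Y}}_{k_n}(t_r)|^{2/3} \cdot \Bigl( \Bigl( \frac{1}{k_n} \sum_{r=0}^{k_n-1} |\widehat{\mathcal{Y}}_{k_n}(t_r)|^{2/3} \Bigr)^{3/2} \Bigr)^{p/(p+1)} \Bigr\rfloor, & \text{if } \sum_{r=0}^{k_n-1} |\widehat{\mathcal{Y}}_{k_n}(t_r)|^{2/3} > 0, \\ 0, & \text{otherwise,} \end{cases} \]

and define

\[ \widehat{X}_{p,n}^{**} = \widehat{X}_{k_n}^{\mu^{(n)}}, \]

where

\[ \mu^{(n)} = (\mu_0^{(n)}, \dots, \mu_{k_n-1}^{(n)}). \]

Note that the numbers $\mu_l^{(n)}$ crucially depend on the error parameter p. If
p = 2, then

\[ \mu_l^{(n)} = \lfloor n/k_n \cdot |\widehat{\mathcal{Y}}_{k_n}(t_l)|^{2/3} \rfloor. \]

If $p \neq 2$, then all of the approximations $\widehat{\mathcal{Y}}_{k_n}(t_l)$ have to be computed beforehand
in order to determine the adaptive discretization.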

Finally, we mention that the total number of evaluations of W that are
used to obtain the approximation $\widehat{X}_{p,n}^{**}(1)$ is roughly given by $n \cdot \|\mathcal{Y}\|_{2/3}^{p/(p+1)}$,
where

\[ \|\mathcal{Y}\|_{2/3} = \Bigl( \int_0^1 |\mathcal{Y}(t)|^{2/3}\,dt \Bigr)^{3/2} \]

is the pathwise 2/3-seminorm of the weighting process $\mathcal{Y}$. In general, this
quantity depends on the trajectory of $\mathcal{Y}$ so that there is no a priori bound
on the computation time available for the user. If all approximations have
to be computed in the same amount of time, the following version $\widehat{X}_n^{*}$ of
the basic adaptive scheme can be used. However, note that a price has to
be paid for this property; see Theorem 2.


4.3.2. The scheme X* with fixed number of observations of W. In con¬ 
trast to the scheme X** n , the adaptive discretization used by the scheme X* 
does not depend on the error parameter p. Let 



(n 

L(n 


K)-\y K (ti) | 2/3 / 


in J- 

E l5U‘r)| 2/3 


r=0 


if y kn > o, 


otherwise 
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and define X* = ) , where fj,^ is determined by ,..., • 

By definition,

$$n - k_n < k_n + \sum_{l=0}^{k_n-1} \mu_l^{(n)} \le n$$

holds for the total number of observations, so that the resulting approximation
$\hat X^{*}_n(1)$ belongs to the class $\mathfrak X^{*}_n$.
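The two allocation rules of Sections 4.3.1 and 4.3.2 can be sketched in a few lines. This is a minimal illustration only, not the implementation used in the paper: the function names `allocate_varying` and `allocate_fixed` are hypothetical, and the list `y_hat` merely stands in for the values $\hat{\mathcal Y}_{k_n}(t_l)$, whose computation (Section 4.2) is not reproduced here. The sketch assumes the rules $\mu_l = \lfloor n\, w_l / \sum_r w_r \cdot \hat{\mathcal S}^{p/(p+1)} \rfloor$ and $\mu_l = \lfloor (n-k)\, w_l / \sum_r w_r \rfloor$ with $w_l = |\hat{\mathcal Y}(t_l)|^{2/3}$ and $\hat{\mathcal S} = (k^{-1}\sum_r w_r)^{3/2}$.

```python
import math


def _weights(y_hat):
    """Local weights |y|^(2/3) that drive both allocation rules."""
    return [abs(y) ** (2.0 / 3.0) for y in y_hat]


def allocate_varying(y_hat, n, p):
    """Extra observations per subinterval; the total cost depends on the path.

    y_hat : stand-ins for the approximate weights at t_0, ..., t_{k-1}
    n     : discretization parameter
    p     : error parameter, p >= 1
    """
    k = len(y_hat)
    w = _weights(y_hat)
    total = sum(w)
    if total == 0.0:
        return [0] * k
    s = (total / k) ** 1.5           # discrete pathwise 2/3-seminorm
    scale = s ** (p / (p + 1.0))     # p-dependent budget factor
    return [math.floor(n * wl / total * scale) for wl in w]


def allocate_fixed(y_hat, n):
    """Extra observations per subinterval with total budget at most n."""
    k = len(y_hat)
    w = _weights(y_hat)
    total = sum(w)
    if total == 0.0:
        return [0] * k
    return [math.floor((n - k) * wl / total) for wl in w]


if __name__ == "__main__":
    y_hat = [0.5, -1.3, 2.7, 0.1, 3.3]   # made-up values for illustration
    print(allocate_varying(y_hat, n=200, p=2.0))
    print(allocate_fixed(y_hat, n=200))
```

For the fixed-budget rule the total number of observations, `len(y_hat) + sum(mu)`, never exceeds `n`; for the varying rule it scales with the pathwise factor $\hat{\mathcal S}^{p/(p+1)}$ and is therefore not known in advance.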

4.3.3. The scheme $\hat X_n$ with prefixed discretization. Replacing the quantities
$|\hat{\mathcal Y}_{k_n}(t_l)|$ by

$$(E|\mathcal Y(t_l)|^2)^{1/2}$$

in the definition of the numbers $\mu_l^{(n)}$ in Section 4.3.2, we obtain the scheme $\hat X_n$,
which uses the same discretization for every trajectory of the weighting process
$\mathcal Y$. The resulting approximation $\hat X_n(1)$ thus belongs to the class $\mathfrak X_n$.
Note that this method requires the computation of the second moments
of $\mathcal Y$, which might be a difficult task in general.

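4.4. Error analysis of the adaptive schemes. Now we investigate the
asymptotic performance of the approximations $\hat X^{**}_{p,n}(1)$, $\hat X^{*}_n(1)$ and $\hat X_n(1)$.
Additionally, we consider the scheme

$$\hat X^{\mathrm{equi}}_n = \hat X^{0}_n,$$

which only uses the observations $W(1/n), W(2/n),\dots,W(1)$ of the driving
Brownian motion $W$. Thus, $\hat X^{\mathrm{equi}}_n(1) \in \mathfrak X^{\mathrm{equi}}_n$. Note that $\hat X^{\mathrm{equi}}_n$ is given by

$$\hat X^{\mathrm{equi}}_n(l/n) = \hat X^{\mathrm{WPt}}_n(l/n) + \frac{1}{2n}\sum_{r=0}^{l-1} \hat{\mathcal Y}_n(t_r)\cdot\bigl(W((r+1)/n) - W(r/n)\bigr)$$

for $l = 0,\dots,n$.

Recall the constants $C^{**}_p$, $C^{*}_p$, $C_2$ and $C^{\mathrm{equi}}_p$ from Section 3.2.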
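The factor $1/(2n)$ in the equidistant scheme can be traced to a one-line computation. The following is a sketch under the assumption, suggested by the proof of Lemma 6, that the basic scheme replaces $W$ inside the integral term by its piecewise linear interpolation $\widetilde W$ at the observation points; with no additional points ($\mu = 0$) and $t_r = r/n$ this gives:

```latex
\int_{t_r}^{t_{r+1}} \bigl(\widetilde W(t) - W(t_r)\bigr)\,dt
  = \int_{t_r}^{t_{r+1}} n\,(t - t_r)\,\bigl(W(t_{r+1}) - W(t_r)\bigr)\,dt
  = \frac{1}{2n}\,\bigl(W(t_{r+1}) - W(t_r)\bigr).
```

Under this assumption, the correction term of the equidistant scheme is simply the weighted sum of Brownian increments, each scaled by half the step size.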

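Theorem 2. The adaptive schemes $\hat X^{**}_{p,n}$, $\hat X^{*}_n$, $\hat X_n$ and the equidistant
scheme $\hat X^{\mathrm{equi}}_n$ satisfy:

(i) $\lim_{n\to\infty} c(\hat X^{**}_{p,n}(1)) \cdot e_p(\hat X^{**}_{p,n}(1)) = 1/\sqrt{12} \cdot C^{**}_p$,

(ii) $\lim_{n\to\infty} n \cdot e_p(\hat X^{*}_n(1)) = 1/\sqrt{12} \cdot C^{*}_p$,

(iii) $\lim_{n\to\infty} n \cdot e_2(\hat X_n(1)) = 1/\sqrt{12} \cdot C_2$,

(iv) $\lim_{n\to\infty} n \cdot e_p(\hat X^{\mathrm{equi}}_n(1)) = 1/\sqrt{12} \cdot C^{\mathrm{equi}}_p$.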

Combining Theorem 2 with Theorem 1 from Section 3.2, we immediately
obtain

Theorem 3. Assume $C^{\mathrm{equi}}_p > 0$. Then the schemes $\hat X^{**}_{p,n}$, $\hat X^{*}_n$ and $\hat X^{\mathrm{equi}}_n$
are asymptotically optimal for pathwise approximation at $t = 1$ in the respective
classes of methods $\mathfrak X^{**}$, $\mathfrak X^{*}$ and $\mathfrak X^{\mathrm{equi}}$. Moreover, if $p = 2$, then $\hat X_n$ is
asymptotically optimal for pathwise approximation at $t = 1$ in the class $\mathfrak X$.

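Remark 7. We stress that, in general, the asymptotic constants $C_2/\sqrt{12}$
and $C^{\mathrm{equi}}_2/\sqrt{12}$ cannot be achieved by the Milstein scheme. As an example,
consider the equation

$$dX(t) = \sigma(t)\,dW(t), \qquad X(0) = 0,$$

with $\sigma \in C^1([0,1])$. For a discretization

$$0 = t_0 < \dots < t_n = 1,$$

the corresponding Milstein scheme is given by $\hat X^{\mathrm M}_n(0) = 0$ and

$$\hat X^{\mathrm M}_n(t_l) = \sum_{r=0}^{l-1} \sigma(t_r)\cdot(W(t_{r+1}) - W(t_r)), \qquad l = 1,\dots,n.$$

Straightforward calculations yield

(13) $$(e_2(\hat X^{\mathrm M}_n(1)))^2 = \frac13 \sum_{l=0}^{n-1} (\sigma'(\xi_l))^2 \cdot (t_{l+1} - t_l)^3$$

with $t_l \le \xi_l \le t_{l+1}$, so that

$$n^2 \cdot (e_2(\hat X^{\mathrm M}_n(1)))^2 \ge \frac13 \Bigl(\sum_{l=0}^{n-1} |\sigma'(\xi_l)|^{2/3}\cdot(t_{l+1} - t_l)\Bigr)^3$$

by the Hölder inequality. Consequently,

$$\liminf_{n\to\infty}\, n \cdot \inf_{0<t_1<\dots<t_n=1} e_2(\hat X^{\mathrm M}_n(1)) \ge \frac{1}{\sqrt 3}\cdot\Bigl(\int_0^1 |\sigma'(t)|^{2/3}\,dt\Bigr)^{3/2} = \frac{C_2}{\sqrt 3}.$$

Thus, whatever the discretization, the resulting Milstein scheme asymptotically
performs suboptimally with respect to pathwise approximation at $t = 1$.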

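Similarly, for the equidistant Milstein scheme $\hat X^{\mathrm{M,equi}}_n$ we obtain from (13)
that

$$\lim_{n\to\infty} n \cdot e_2(\hat X^{\mathrm{M,equi}}_n(1)) = \frac{1}{\sqrt 3}\cdot\Bigl(\int_0^1 |\sigma'(t)|^2\,dt\Bigr)^{1/2} = \frac{C^{\mathrm{equi}}_2}{\sqrt 3}.$$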
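The gap between the Milstein constant $1/\sqrt 3$ and the optimal constant $1/\sqrt{12}$ can be checked numerically for the example equation $dX(t) = \sigma(t)\,dW(t)$. The sketch below is an illustration only; the helper names `simpson` and `milstein_l2_error` are hypothetical. It uses the Itô isometry, by which the mean square error at $t = 1$ equals $\sum_l \int_{t_l}^{t_{l+1}} (\sigma(t) - \sigma(t_l))^2\,dt$, and evaluates these integrals by Simpson's rule.

```python
import math


def simpson(f, a, b, m=20):
    """Composite Simpson rule on [a, b] with 2*m subintervals."""
    h = (b - a) / (2 * m)
    s = f(a) + f(b)
    for i in range(1, 2 * m):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3.0


def milstein_l2_error(sigma, grid):
    """Root mean square error at t = 1 of sum_r sigma(t_r) * (W(t_{r+1}) - W(t_r))
    as an approximation of int_0^1 sigma dW, computed via the Ito isometry."""
    sq = 0.0
    for left, right in zip(grid[:-1], grid[1:]):
        sq += simpson(lambda t: (sigma(t) - sigma(left)) ** 2, left, right)
    return math.sqrt(sq)


n = 64
grid = [l / n for l in range(n + 1)]      # equidistant discretization
err = milstein_l2_error(lambda t: t, grid)  # sigma(t) = t, so sigma' = 1

# n * err matches 1/sqrt(3); the optimal constant for this example is 1/sqrt(12)
print(n * err, 1 / math.sqrt(3), 1 / math.sqrt(12))
```

For $\sigma(t) = t$ the weight is $\sigma' \equiv 1$, so $C^{\mathrm{equi}}_2 = 1$ and the normalized Milstein error is twice the optimal value.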
5. Proofs. We introduce an auxiliary scheme $\hat X^{\mathrm{aux}}_k$ corresponding to the
equidistant discretization $t_l = l/k$, $l = 0,\dots,k$, and separately analyze $X(1) - \hat X^{\mathrm{aux}}_k(1)$
and $\hat X^{\mathrm{aux}}_k(1) - \hat X(1)$ for a method $\hat X(1) \in \mathfrak X^{**}$. The scheme $\hat X^{\mathrm{aux}}_k$ is
defined by

$$\hat X^{\mathrm{aux}}_k(t_l) = \hat X^{\mathrm{WPt}}_k(t_l) + Q_k(t_l), \qquad l = 0,\dots,k.$$

Here $\hat X^{\mathrm{WPt}}_k$ is the truncated Wagner-Platen scheme (see Section 4.1) and
the scheme $Q_k$ is given by $Q_k(0) = 0$ and

$$\begin{aligned}
Q_k(t_{l+1}) ={}& \bigl(1 + a^{(0,1)}(t_l, \hat X^{\mathrm{WPt}}_k(t_l))\cdot(t_{l+1} - t_l) \\
&\quad + \sigma^{(0,1)}(t_l, \hat X^{\mathrm{WPt}}_k(t_l))\cdot(W(t_{l+1}) - W(t_l))\bigr)\cdot Q_k(t_l) \\
&+ \mathcal G(t_l, \hat X^{\mathrm{WPt}}_k(t_l))\cdot\int_{t_l}^{t_{l+1}} (W(s) - W(t_l))\,ds
\end{aligned}$$

for l = 0,..., k — 1. Note that 

l -1 

(14) Q k (t l )=J2Mtr)- 

r —0 

Due to Lemma 12 in the Appendix, we have

(15) $$E|X(1) - \hat X^{\mathrm{aux}}_k(1)|^p = O(k^{-3p/2}).$$

Thus, asymptotically $E|\hat X^{\mathrm{aux}}_k(1) - \hat X(1)|^p$ will be the dominating term if $k$
is chosen suitably as a function of $c(\hat X(1))$.

We briefly outline the structure of this section. Basic facts on moments of 
integrated Brownian bridges are stated in Section 5.1. Section 5.2 contains 
error bounds for the discrete approximation $\hat{\mathcal Y}_k$ of the random weight $\mathcal Y$.
The lower bounds in Theorem 1 are proven in Section 5.3. The matching 
upper bounds in Theorem 2 are proven in Section 5.4. 

Throughout the following we use c to denote unspecified positive constants 
that only depend on the error parameter p and the constants from conditions 
(A) and (B) in Section 2. 

5.1. Moments of integrated Brownian bridges. Let B denote a Brownian 
bridge on an interval $[S,T] \subset [0,1]$. Straightforward calculations yield

(16) $$E\Bigl(\int_S^T B(t)\,dt\Bigr)^2 = 1/12 \cdot (T-S)^3.$$

Furthermore, if

$$S = \tau_0 < \dots < \tau_n = T$$

and $B_1,\dots,B_n$ are independent Brownian bridges on the intervals $[\tau_0,\tau_1],\dots,[\tau_{n-1},\tau_n]$,
respectively, then

(17) $$E\Bigl(\sum_{r=0}^{n-1} \int_{\tau_r}^{\tau_{r+1}} B_{r+1}(t)\,dt\Bigr)^2 \ge 1/12 \cdot (T-S)^3 \cdot 1/n^2$$

by the Hölder inequality.
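Both statements can be checked numerically. The sketch below is an illustration with hypothetical function names; it assumes the standard covariance $\mathrm{Cov}(B(s),B(t)) = (\min(s,t) - S)(T - \max(s,t))/(T-S)$ of a Brownian bridge on $[S,T]$, integrates it over $[S,T]^2$ by the midpoint rule to approximate (16), and then uses the closed form (16) on each subinterval to illustrate the bound (17).

```python
def bridge_integral_variance(S, T, m=200):
    """Midpoint-rule approximation of Var(int_S^T B(t) dt) for a Brownian
    bridge B on [S, T], obtained by integrating its covariance function."""
    h = (T - S) / m
    pts = [S + (i + 0.5) * h for i in range(m)]
    total = 0.0
    for s in pts:
        for t in pts:
            lo, hi = (s, t) if s <= t else (t, s)
            total += (lo - S) * (T - hi) / (T - S)
    return total * h * h


def partition_variance(points):
    """Variance of the sum of integrated independent bridges on the
    subintervals of a partition, using the closed form (b - a)^3 / 12."""
    return sum((b - a) ** 3 for a, b in zip(points[:-1], points[1:])) / 12.0


# (16) on [0, 1]: the approximation is close to 1/12
print(bridge_integral_variance(0.0, 1.0), 1.0 / 12.0)

# (17) with n = 4: the equidistant partition attains the lower bound 1/(12 * 16)
print(partition_variance([0.0, 0.25, 0.5, 0.75, 1.0]))
print(partition_variance([0.0, 0.1, 0.4, 0.8, 1.0]))
```

The equidistant partition attains the bound in (17), which is why equal subdivision within each subinterval is the natural choice once the number of points per subinterval is fixed.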

5.2. Error bounds for the estimates $\hat{\mathcal Y}_k$. Recall the discrete approximation
$\hat{\mathcal M}_k$ of the field $\mathcal M$ from Section 4.2.


Lemma 1. For $0 \le l \le k-1$, it holds

$$E|\mathcal M(t_l,1) - \hat{\mathcal M}_k(t_l,1)|^{2p} \le c/k^p.$$


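Proof. Note that, by boundedness of $a^{(0,1)}$ and $\sigma^{(0,1)}$,

(18) $$E|\mathcal M(t_1,s_1) - \mathcal M(t_2,s_2)|^q \le c\cdot c(q)\cdot(\max(|t_1 - t_2|, |s_1 - s_2|))^{q/2}$$

for all $q \ge 1$, $0 \le t_1 \le s_1 \le 1$, $0 \le t_2 \le s_2 \le 1$, where $c(q)$ only depends on $q$.
Fix $l$ and define the process $\widetilde{\mathcal M}_k(t_l,\cdot)$ on $[t_l,1]$ by $\widetilde{\mathcal M}_k(t_l,t_l) = 1$ and

$$\begin{aligned}
\widetilde{\mathcal M}_k(t_l,t) = \widetilde{\mathcal M}_k(t_l,t_r)\cdot\bigl(1 &+ a^{(0,1)}(t_r,\hat X^{\mathrm{WPt}}_k(t_r))\cdot(t - t_r) \\
&+ \sigma^{(0,1)}(t_r,\hat X^{\mathrm{WPt}}_k(t_r))\cdot(W(t) - W(t_r))\bigr)
\end{aligned}$$

for $t \in\, ]t_r,t_{r+1}]$, $r = l,\dots,k-1$. Clearly, $\widetilde{\mathcal M}_k(t_l,t_r) = \hat{\mathcal M}_k(t_l,t_r)$ for $r = l,\dots,k$,
and boundedness of $a^{(0,1)}$ and $\sigma^{(0,1)}$ implies

(19) $$E\Bigl(\sup_{t_l \le t \le 1} |\widetilde{\mathcal M}_k(t_l,t)|^q\Bigr) \le c\cdot c(q)$$

for every $q \ge 1$, where the constant $c(q)$ only depends on $q$.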

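Let $t \in [t_l,1]$. Due to (10) we have

$$\begin{aligned}
|\mathcal M(t_l,t) - \widetilde{\mathcal M}_k(t_l,t)|^{2p}
\le{}& c \int_{t_l}^{t} \sum_{r=l}^{k-1} \bigl|a^{(0,1)}(s,X(s))\,\mathcal M(t_l,s) \\
&\qquad - a^{(0,1)}(t_r,\hat X^{\mathrm{WPt}}_k(t_r))\,\hat{\mathcal M}_k(t_l,t_r)\bigr|^{2p}\, 1_{]t_r,t_{r+1}]}(s)\,ds \\
&+ c\,\Bigl|\int_{t_l}^{t} \sum_{r=l}^{k-1} \bigl(\sigma^{(0,1)}(s,X(s))\,\mathcal M(t_l,s) \\
&\qquad - \sigma^{(0,1)}(t_r,\hat X^{\mathrm{WPt}}_k(t_r))\,\hat{\mathcal M}_k(t_l,t_r)\bigr)\, 1_{]t_r,t_{r+1}]}(s)\,dW(s)\Bigr|^{2p}.
\end{aligned}$$

Put $V(t) = \sup_{t_l \le s \le t} |\mathcal M(t_l,s) - \widetilde{\mathcal M}_k(t_l,s)|$. By the Burkholder inequality,

$$\begin{aligned}
E|V(t)|^{2p} \le{}& c \int_{t_l}^{t} \sum_{r=l}^{k-1} E\bigl|a^{(0,1)}(s,X(s))\,\mathcal M(t_l,s) \\
&\qquad - a^{(0,1)}(t_r,\hat X^{\mathrm{WPt}}_k(t_r))\,\hat{\mathcal M}_k(t_l,t_r)\bigr|^{2p}\, 1_{]t_r,t_{r+1}]}(s)\,ds \\
&+ c \int_{t_l}^{t} \sum_{r=l}^{k-1} E\bigl|\sigma^{(0,1)}(s,X(s))\,\mathcal M(t_l,s) \\
&\qquad - \sigma^{(0,1)}(t_r,\hat X^{\mathrm{WPt}}_k(t_r))\,\hat{\mathcal M}_k(t_l,t_r)\bigr|^{2p}\, 1_{]t_r,t_{r+1}]}(s)\,ds.
\end{aligned}$$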

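Let $s \in [t_r,t_{r+1}]$. By (A),

$$\begin{aligned}
&|a^{(0,1)}(s,X(s))\cdot\mathcal M(t_l,s) - a^{(0,1)}(t_r,\hat X^{\mathrm{WPt}}_k(t_r))\cdot\hat{\mathcal M}_k(t_l,t_r)| \\
&\quad\le |a^{(0,1)}(s,X(s))\cdot(\mathcal M(t_l,s) - \mathcal M(t_l,t_r))| \\
&\qquad + |(a^{(0,1)}(s,X(s)) - a^{(0,1)}(t_r,X(s)))\cdot\mathcal M(t_l,t_r)| \\
&\qquad + |(a^{(0,1)}(t_r,X(s)) - a^{(0,1)}(t_r,X(t_r)))\cdot\mathcal M(t_l,t_r)| \\
&\qquad + |(a^{(0,1)}(t_r,X(t_r)) - a^{(0,1)}(t_r,\hat X^{\mathrm{WPt}}_k(t_r)))\cdot\mathcal M(t_l,t_r)| \\
&\qquad + |a^{(0,1)}(t_r,\hat X^{\mathrm{WPt}}_k(t_r))\cdot(\mathcal M(t_l,t_r) - \hat{\mathcal M}_k(t_l,t_r))| \\
&\quad\le c\cdot|\mathcal M(t_l,s) - \mathcal M(t_l,t_r)| \\
&\qquad + c\cdot\bigl((1 + |X(s)|)\cdot(s - t_r) + |X(s) - X(t_r)| + |X(t_r) - \hat X^{\mathrm{WPt}}_k(t_r)|\bigr)\cdot|\mathcal M(t_l,t_r)| \\
&\qquad + c\cdot V(s).
\end{aligned}$$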

Observing (5), (6), Lemma 10 and (18), we thus obtain 

E\a^ (s, X(s)) -M(ti,8)- (t r , X, WPt (f r )) ■ M k (t h t r )\ 2p 

< C • (E\X(t r )\ Ap + E\X(s) - X(t r )\ Ap 

+ E\X(t r ) - X^ Pt (t r ) | 4p ) 1/2 • (E\M(t h t r )\ Ap ) 1/2 
+ c ■ E\V(s)\ 2p 
<c/k p + c- E\V(s)\ 2p , 

and the same inequality holds with $\sigma^{(0,1)}$ in place of $a^{(0,1)}$.

Consequently, for every $t \in [t_l,1]$,

$$E|V(t)|^{2p} \le c/k^p + c\cdot\int_{t_l}^{t} E|V(s)|^{2p}\,ds.$$

Moreover, by (18) and (19),

$$\sup_{t_l \le t \le 1} E|V(t)|^{2p} < \infty.$$

Thus, Gronwall's lemma yields

$$\sup_{t_l \le t \le 1} E|V(t)|^{2p} \le c/k^p,$$

which completes the proof. □

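Lemma 2. For $0 \le l \le k-1$, it holds

$$E|\hat{\mathcal Y}_k(t_l) - \mathcal Y(t_l)|^p \le c/k^{p/2}.$$

Proof. Due to (A),

$$|\mathcal G(t_l,\hat X^{\mathrm{WPt}}_k(t_l)) - \mathcal G(t_l,X(t_l))| \le c\cdot(1 + |X(t_l)|^2 + |\hat X^{\mathrm{WPt}}_k(t_l)|^2)\cdot|\hat X^{\mathrm{WPt}}_k(t_l) - X(t_l)|$$

and

$$|\mathcal G(t_l,\hat X^{\mathrm{WPt}}_k(t_l))| \le c\cdot(1 + |\hat X^{\mathrm{WPt}}_k(t_l)|^2).$$

Hence, by (5), (18), Lemmas 1 and 10,

$$\begin{aligned}
E|\hat{\mathcal Y}_k(t_l) - \mathcal Y(t_l)|^p
\le{}& c\cdot E|(\hat{\mathcal M}_k(t_{l+1},1) - \mathcal M(t_l,1))\cdot\mathcal G(t_l,\hat X^{\mathrm{WPt}}_k(t_l))|^p \\
&+ c\cdot E|\mathcal M(t_l,1)\cdot(\mathcal G(t_l,\hat X^{\mathrm{WPt}}_k(t_l)) - \mathcal G(t_l,X(t_l)))|^p \\
\le{}& c\cdot(E|\hat{\mathcal M}_k(t_{l+1},1) - \mathcal M(t_l,1)|^{2p})^{1/2}\cdot(E(1 + |\hat X^{\mathrm{WPt}}_k(t_l)|^{4p}))^{1/2} \\
&+ c\cdot(E|\mathcal M(t_l,1)|^{4p})^{1/4}\cdot(1 + E|X(t_l)|^{4p} + E|\hat X^{\mathrm{WPt}}_k(t_l)|^{4p})^{1/2} \\
&\qquad\times(E|\hat X^{\mathrm{WPt}}_k(t_l) - X(t_l)|^{4p})^{1/4} \\
\le{}& c/k^{p/2},
\end{aligned}$$

which completes the proof. □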

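Put

$$R = \frac1k\sum_{l=0}^{k-1} |\mathcal Y(t_l)|^{2/3}, \qquad \hat R = \frac1k\sum_{l=0}^{k-1} |\hat{\mathcal Y}_k(t_l)|^{2/3}.$$

Lemma 3. If $1 \le p \le 2$, then

$$|E(\hat R^{3p/2(p+1)}) - E(R^{3p/2(p+1)})| \le c/k^{p/2(p+1)}.$$

If $p > 2$, then

$$|(E(\hat R^{3p/2(p+1)}))^{2(p+1)/3p} - (E(R^{3p/2(p+1)}))^{2(p+1)/3p}| \le c/k^{1/3}.$$

Furthermore,

$$|(E(\hat R^{3p/2}))^{2/3p} - (E(R^{3p/2}))^{2/3p}| \le c/k^{1/3}.$$

Proof. Clearly,

$$|\hat R - R| \le \frac1k\sum_{l=0}^{k-1} |\hat{\mathcal Y}_k(t_l) - \mathcal Y(t_l)|^{2/3}.$$

Assume $1 \le p \le 2$. Then $3p/2(p+1) \le 1$ and we obtain

$$\begin{aligned}
E|\hat R^{3p/2(p+1)} - R^{3p/2(p+1)}| &\le E|\hat R - R|^{3p/2(p+1)} \\
&\le E\Bigl(\frac1k\sum_{l=0}^{k-1} |\hat{\mathcal Y}_k(t_l) - \mathcal Y(t_l)|^{2/3}\Bigr)^{3p/2(p+1)} \\
&\le \Bigl(\frac1k\sum_{l=0}^{k-1} E|\hat{\mathcal Y}_k(t_l) - \mathcal Y(t_l)|^{2/3}\Bigr)^{3p/2(p+1)} \\
&\le c/k^{p/2(p+1)}
\end{aligned}$$

by Lemma 2, which proves the first inequality.

Next, let $p > 2$. Then $3p/2(p+1) > 1$. By Lemma 2,

$$\begin{aligned}
&|(E(\hat R^{3p/2(p+1)}))^{2(p+1)/3p} - (E(R^{3p/2(p+1)}))^{2(p+1)/3p}| \\
&\quad\le (E|\hat R - R|^{3p/2(p+1)})^{2(p+1)/3p} \\
&\quad\le \frac1k\sum_{l=0}^{k-1} (E|\hat{\mathcal Y}_k(t_l) - \mathcal Y(t_l)|^{p/(p+1)})^{2(p+1)/3p} \\
&\quad\le c/k^{1/3},
\end{aligned}$$

which shows the second inequality. The third inequality is established in the
same way. □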

5.3. Proof of the lower bounds in Theorem 1. Consider an arbitrary sequence
of methods $\hat X_N(1) \in \mathfrak X^{**}_N$. Take a sequence of positive integers $k_N$
that satisfies

(20) $$\lim_{N\to\infty} N/k_N^{3/2} = \lim_{N\to\infty} k_N/N = 0$$

and assume without loss of generality that $\hat X_N(1)$ uses in particular the
evaluation sites

$$t_l = l/k_N, \qquad l = 0,\dots,k_N.$$

Let $d_l^{(N)}$ denote the number of evaluation points that are used by $\hat X_N(1)$ in
the interval $]t_l,t_{l+1}[$ and put

$$A_N = \Bigl(\sum_{l=0}^{k_N-1} \bigl(\mathcal Y(t_l)/(d_l^{(N)} + 1)\bigr)^2\Bigr)^{p/2}.$$

Lemma 4.

$$\liminf_{N\to\infty} N\cdot e_p(\hat X_N(1)) \ge m_p/12^{1/2}\cdot\liminf_{N\to\infty} N/k_N^{3/2}\cdot(E(A_N))^{1/p}.$$


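Proof. By (15),

(21) $$e_p(\hat X_N(1)) \ge (E|\hat X^{\mathrm{aux}}_{k_N}(1) - \hat X_N(1)|^p)^{1/p} - c/k_N^{3/2}.$$

Let $\mathfrak A_N$ denote the $\sigma$-algebra that is generated by the data used by $\hat X_N(1)$
and put $Z = W - E(W|\mathfrak A_N)$ as well as

$$V = \hat X_N(1) - \hat X^{\mathrm{WPt}}_{k_N}(1) - \sum_{l=0}^{k_N-1} \hat{\mathcal Y}_{k_N}(t_l)\cdot\int_{t_l}^{t_{l+1}} \bigl(E(W(t)|\mathfrak A_N) - W(t_l)\bigr)\,dt.$$

By definition of $\hat X^{\mathrm{aux}}_{k_N}$ and (14), we have

$$\hat X^{\mathrm{aux}}_{k_N}(1) - \hat X_N(1) = \sum_{l=0}^{k_N-1} \hat{\mathcal Y}_{k_N}(t_l)\cdot\int_{t_l}^{t_{l+1}} Z(t)\,dt - V.$$

Note that $V$ and the numbers $\hat{\mathcal Y}_{k_N}(t_l)$ are $\mathfrak A_N$-measurable. Conditioned on $\mathfrak A_N$,
the evaluation sites used by $\hat X_N(1)$ are fixed and the process $Z$ consists of
independent Brownian bridges corresponding to the respective subintervals.
Hence, by (17),

(22) $$\begin{aligned}
E(|\hat X^{\mathrm{aux}}_{k_N}(1) - \hat X_N(1)|^p \,|\, \mathfrak A_N)
&\ge E\Bigl(\Bigl|\sum_{l=0}^{k_N-1} \hat{\mathcal Y}_{k_N}(t_l)\cdot\int_{t_l}^{t_{l+1}} Z(t)\,dt\Bigr|^p \,\Big|\, \mathfrak A_N\Bigr) \\
&= m_p^p\cdot\Bigl(E\Bigl(\Bigl|\sum_{l=0}^{k_N-1} \hat{\mathcal Y}_{k_N}(t_l)\cdot\int_{t_l}^{t_{l+1}} Z(t)\,dt\Bigr|^2 \,\Big|\, \mathfrak A_N\Bigr)\Bigr)^{p/2} \\
&= m_p^p\cdot\Bigl(\sum_{l=0}^{k_N-1} (\hat{\mathcal Y}_{k_N}(t_l))^2\cdot E\Bigl(\Bigl(\int_{t_l}^{t_{l+1}} Z(t)\,dt\Bigr)^2 \,\Big|\, \mathfrak A_N\Bigr)\Bigr)^{p/2} \\
&\ge m_p^p\cdot\Bigl(\sum_{l=0}^{k_N-1} (\hat{\mathcal Y}_{k_N}(t_l))^2\cdot\bigl(12\cdot k_N^3\cdot(d_l^{(N)} + 1)^2\bigr)^{-1}\Bigr)^{p/2} \\
&= m_p^p/12^{p/2}\cdot 1/k_N^{3p/2}\cdot\Bigl(\sum_{l=0}^{k_N-1} \bigl(\hat{\mathcal Y}_{k_N}(t_l)/(d_l^{(N)} + 1)\bigr)^2\Bigr)^{p/2}.
\end{aligned}$$

Combine (21) with (22) to obtain

$$\liminf_{N\to\infty} N\cdot e_p(\hat X_N(1)) \ge m_p/12^{1/2}\cdot\liminf_{N\to\infty} N/k_N^{3/2}\cdot\Bigl(E\Bigl(\sum_{l=0}^{k_N-1} \bigl(\hat{\mathcal Y}_{k_N}(t_l)/(d_l^{(N)} + 1)\bigr)^2\Bigr)^{p/2}\Bigr)^{1/p}.$$

Let $q = \max(2,p)$. Lemma 2 implies

$$\begin{aligned}
&\Bigl|\Bigl(E\Bigl(\sum_{l=0}^{k_N-1} \bigl(\hat{\mathcal Y}_{k_N}(t_l)/(d_l^{(N)} + 1)\bigr)^2\Bigr)^{p/2}\Bigr)^{1/p} - (E(A_N))^{1/p}\Bigr| \\
&\quad\le \Bigl(E\Bigl(\sum_{l=0}^{k_N-1} |\hat{\mathcal Y}_{k_N}(t_l) - \mathcal Y(t_l)|^2\Bigr)^{q/2}\Bigr)^{1/q} \\
&\quad\le \Bigl(\sum_{l=0}^{k_N-1} (E|\hat{\mathcal Y}_{k_N}(t_l) - \mathcal Y(t_l)|^q)^{2/q}\Bigr)^{1/2} \\
&\quad\le c.
\end{aligned}$$

Thus, by (20),

$$\liminf_{N\to\infty} N/k_N^{3/2}\cdot\Bigl(E\Bigl(\sum_{l=0}^{k_N-1} \bigl(\hat{\mathcal Y}_{k_N}(t_l)/(d_l^{(N)} + 1)\bigr)^2\Bigr)^{p/2}\Bigr)^{1/p} \ge \liminf_{N\to\infty} N/k_N^{3/2}\cdot(E(A_N))^{1/p},$$

which completes the proof. □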

Now, we analyze the classes $\mathfrak X^{**}_N$, $\mathfrak X^{*}_N$, $\mathfrak X^{\mathrm{equi}}_N$ and the class $\mathfrak X_N$ in the case
$p = 2$.


Lemma 5.

(i) If $\hat X_N(1) \in \mathfrak X^{**}_N$ for every $N$, then

$$\liminf_{N\to\infty} N/k_N^{3/2}\cdot(E(A_N))^{1/p} \ge C^{**}_p/m_p.$$

(ii) If $\hat X_N(1) \in \mathfrak X^{*}_N$ for every $N$, then

$$\liminf_{N\to\infty} N/k_N^{3/2}\cdot(E(A_N))^{1/p} \ge C^{*}_p/m_p.$$

(iii) If $p = 2$ and $\hat X_N(1) \in \mathfrak X_N$ for every $N$, then

$$\liminf_{N\to\infty} N/k_N^{3/2}\cdot(E(A_N))^{1/2} \ge C_2.$$

(iv) If $\hat X_N(1) \in \mathfrak X^{\mathrm{equi}}_N$ for every $N$, then

$$\liminf_{N\to\infty} N/k_N^{3/2}\cdot(E(A_N))^{1/p} \ge C^{\mathrm{equi}}_p/m_p.$$


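Proof. By definition of $\mathfrak X^{**}_N$ and the Hölder inequality,

$$\begin{aligned}
N^{p/(p+1)}\cdot(E(A_N))^{1/(p+1)}
&\ge \Bigl(E\Bigl(\sum_{l=0}^{k_N-1} (d_l^{(N)} + 1)\Bigr)\Bigr)^{p/(p+1)}\cdot(E(A_N))^{1/(p+1)} \\
&\ge E\Bigl(\Bigl(\sum_{l=0}^{k_N-1} (d_l^{(N)} + 1)\Bigr)^{p/(p+1)}\cdot A_N^{1/(p+1)}\Bigr) \\
&= E\Bigl(\Bigl(\sum_{l=0}^{k_N-1} (d_l^{(N)} + 1)\Bigr)^{2/3}\cdot\Bigl(\sum_{l=0}^{k_N-1} \bigl(\mathcal Y(t_l)/(d_l^{(N)} + 1)\bigr)^2\Bigr)^{1/3}\Bigr)^{3p/(2(p+1))} \\
&\ge E\Bigl(\sum_{l=0}^{k_N-1} |\mathcal Y(t_l)|^{2/3}\Bigr)^{3p/(2(p+1))}.
\end{aligned}$$

Hence (i) follows from

$$\begin{aligned}
\liminf_{N\to\infty} N/k_N^{3/2}\cdot(E(A_N))^{1/p}
&\ge \liminf_{N\to\infty}\Bigl(E\Bigl(\frac{1}{k_N}\sum_{l=0}^{k_N-1} |\mathcal Y(t_l)|^{2/3}\Bigr)^{3p/(2(p+1))}\Bigr)^{(p+1)/p} \\
&= \Bigl(E\Bigl(\int_0^1 |\mathcal Y(t)|^{2/3}\,dt\Bigr)^{3p/(2(p+1))}\Bigr)^{(p+1)/p} \\
&= C^{**}_p/m_p.
\end{aligned}$$

By definition of $\mathfrak X^{*}_N$,

$$N^p\cdot E(A_N) \ge E\Bigl(\Bigl(\sum_{l=0}^{k_N-1} (d_l^{(N)} + 1)\Bigr)^2\cdot\sum_{l=0}^{k_N-1} \bigl(\mathcal Y(t_l)/(d_l^{(N)} + 1)\bigr)^2\Bigr)^{p/2} \ge E\Bigl(\sum_{l=0}^{k_N-1} |\mathcal Y(t_l)|^{2/3}\Bigr)^{3p/2},$$

so that

$$\begin{aligned}
\liminf_{N\to\infty} N/k_N^{3/2}\cdot(E(A_N))^{1/p}
&\ge \liminf_{N\to\infty}\Bigl(E\Bigl(\frac{1}{k_N}\sum_{l=0}^{k_N-1} |\mathcal Y(t_l)|^{2/3}\Bigr)^{3p/2}\Bigr)^{1/p} \\
&= \Bigl(E\Bigl(\int_0^1 |\mathcal Y(t)|^{2/3}\,dt\Bigr)^{3p/2}\Bigr)^{1/p} \\
&= C^{*}_p/m_p,
\end{aligned}$$

which proves (ii).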

Next, assume $p = 2$. By definition of $\mathfrak X_N$, the numbers $d_l^{(N)}$ are deterministic.
Thus

$$N^2\cdot E(A_N) \ge \Bigl(\sum_{l=0}^{k_N-1} (d_l^{(N)} + 1)\Bigr)^2\cdot\sum_{l=0}^{k_N-1} E(\mathcal Y(t_l))^2/(d_l^{(N)} + 1)^2 \ge \Bigl(\sum_{l=0}^{k_N-1} (E(\mathcal Y(t_l))^2)^{1/3}\Bigr)^3.$$

It follows that

$$\liminf_{N\to\infty} N/k_N^{3/2}\cdot(E(A_N))^{1/2} \ge \liminf_{N\to\infty}\Bigl(\frac{1}{k_N}\sum_{l=0}^{k_N-1} (E(\mathcal Y(t_l))^2)^{1/3}\Bigr)^{3/2} = C_2,$$

which shows (iii).


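Finally, by definition of $\mathfrak X^{\mathrm{equi}}_N$, the numbers $d_l^{(N)}$ are deterministic with

$$d_0^{(N)} = \dots = d_{k_N-1}^{(N)} \le N/k_N - 1.$$

Hence,

$$N\cdot(E(A_N))^{1/p} \ge k_N\cdot\Bigl(E\Bigl(\sum_{l=0}^{k_N-1} (\mathcal Y(t_l))^2\Bigr)^{p/2}\Bigr)^{1/p}.$$

Consequently,

$$\begin{aligned}
\liminf_{N\to\infty} N/k_N^{3/2}\cdot(E(A_N))^{1/p}
&\ge \liminf_{N\to\infty}\Bigl(E\Bigl(\frac{1}{k_N}\sum_{l=0}^{k_N-1} (\mathcal Y(t_l))^2\Bigr)^{p/2}\Bigr)^{1/p} \\
&= \Bigl(E\Bigl(\int_0^1 |\mathcal Y(t)|^2\,dt\Bigr)^{p/2}\Bigr)^{1/p} \\
&= C^{\mathrm{equi}}_p/m_p,
\end{aligned}$$

which completes the proof. □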
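The Hölder step behind parts (i) and (ii) is the inequality $\sum_l |y_l|^{2/3} \le (\sum_l (d_l + 1))^{2/3}\cdot(\sum_l (y_l/(d_l+1))^2)^{1/3}$, with equality when $d_l + 1$ is proportional to $|y_l|^{2/3}$. The following sketch, with the hypothetical helper `holder_sides` and made-up inputs, checks this numerically; it illustrates the inequality only and is not part of the proof.

```python
import random


def holder_sides(y, d):
    """Return both sides of the Hoelder step for weights y_l and point counts d_l."""
    lhs = sum(abs(v) ** (2.0 / 3.0) for v in y)
    cost = sum(dl + 1 for dl in d)
    err = sum((v / (dl + 1)) ** 2 for v, dl in zip(y, d))
    rhs = cost ** (2.0 / 3.0) * err ** (1.0 / 3.0)
    return lhs, rhs


random.seed(0)
y = [random.uniform(-5.0, 5.0) for _ in range(6)]
d = [random.randint(0, 10) for _ in range(6)]
print(holder_sides(y, d))             # left side does not exceed right side

# Equality case: d_l + 1 = |y_l|^(2/3) for y = (1, 8, 27)
print(holder_sides([1.0, 8.0, 27.0], [0, 3, 8]))
```

The equality case explains why the adaptive schemes of Section 4.3 place observations in proportion to $|\hat{\mathcal Y}_{k_n}(t_l)|^{2/3}$.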


Combine Lemma 4 with Lemma 5 to obtain the lower bounds in Theorem 1.
Clearly, these lower bounds yield the lower bounds from Theorem 2.

5.4. Proof of the upper bounds in Theorem 2. Let $k \in \mathbb N$ and consider a
basic scheme $\hat X^{\mu}_k$; see Section 4.3. Put

$$B_k = \Bigl(\sum_{l=0}^{k-1} \bigl(\hat{\mathcal Y}_k(t_l)/(\mu_l + 1)\bigr)^2\Bigr)^{p/2}.$$

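Lemma 6.

$$e_p(\hat X^{\mu}_k(1)) \le m_p/12^{1/2}\cdot 1/k^{3/2}\cdot(E(B_k))^{1/p} + c/k^{3/2}.$$

Proof. Due to (15), we have

(23) $$e_p(\hat X^{\mu}_k(1)) \le (E|\hat X^{\mathrm{aux}}_k(1) - \hat X^{\mu}_k(1)|^p)^{1/p} + c/k^{3/2}.$$

By (11) and (14),

$$\hat X^{\mathrm{aux}}_k(1) - \hat X^{\mu}_k(1) = Q_k(1) - Q^{\mu}_k(1) = \sum_{l=0}^{k-1} \hat{\mathcal Y}_k(t_l)\cdot\int_{t_l}^{t_{l+1}} (W(t) - \widetilde W^{\mu}(t))\,dt.$$

Let $\mathfrak B$ denote the $\sigma$-algebra that is generated by $X(0), W(t_1),\dots,W(1)$,
and recall that the adaptive discretization determined by $\mu$ consists of the
$\mathfrak B$-measurable points

$$\tau_{l,r} = t_l + r/(k\cdot(\mu_l + 1)), \qquad r = 0,\dots,\mu_l + 1.$$

Conditioned on $\mathfrak B$, the discretization is fixed and the process $W - \widetilde W^{\mu}$ consists
of independent Brownian bridges corresponding to the respective subintervals.
Using (16), we thus obtain

(24) $$E(|\hat X^{\mathrm{aux}}_k(1) - \hat X^{\mu}_k(1)|^p \,|\, \mathfrak B) = m_p^p/12^{p/2}\cdot 1/k^{3p/2}\cdot\Bigl(\sum_{l=0}^{k-1} \bigl(\hat{\mathcal Y}_k(t_l)/(\mu_l + 1)\bigr)^2\Bigr)^{p/2}.$$

Combine (23) with (24) to obtain the desired result. □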
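The computation behind (24) can be spelled out as follows. This is a worked step under the setting of the proof: conditioned on $\mathfrak B$, the interval $[t_l,t_{l+1}]$ of length $1/k$ is split into $\mu_l + 1$ subintervals of equal length, each carrying an independent Brownian bridge, so that (16) applies on every subinterval.

```latex
E\Bigl(\Bigl(\int_{t_l}^{t_{l+1}} (W(t) - \widetilde W^{\mu}(t))\,dt\Bigr)^2 \,\Big|\, \mathfrak B\Bigr)
  = (\mu_l + 1)\cdot\frac{1}{12}\cdot\Bigl(\frac{1}{k\,(\mu_l + 1)}\Bigr)^{3}
  = \frac{1}{12\,k^{3}\,(\mu_l + 1)^{2}} .
```

Since the weighted sum of these integrals is conditionally centered Gaussian, its conditional $p$-th absolute moment equals $m_p^p$ times the $p/2$-th power of the conditional variance $\sum_l (\hat{\mathcal Y}_k(t_l))^2/(12\,k^3\,(\mu_l+1)^2)$, which is the right-hand side of (24).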

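Now we turn to the specific schemes $\hat X^{**}_{p,n}$, $\hat X^{*}_n$, $\hat X_n$ and $\hat X^{\mathrm{equi}}_n$.

Lemma 7. The scheme $\hat X^{**}_{p,n}$ satisfies

$$\limsup_{n\to\infty}\, c(\hat X^{**}_{p,n}(1))/n \le E\Bigl(\int_0^1 |\mathcal Y(t)|^{2/3}\,dt\Bigr)^{3p/2(p+1)}$$

and

$$\limsup_{n\to\infty}\, n\cdot e_p(\hat X^{**}_{p,n}(1)) \le m_p/12^{1/2}\cdot\Bigl(E\Bigl(\int_0^1 |\mathcal Y(t)|^{2/3}\,dt\Bigr)^{3p/2(p+1)}\Bigr)^{1/p}.$$

Proof. By definition,

$$c(\hat X^{**}_{p,n}(1)) \le k_n + n\cdot E\Bigl(\frac{1}{k_n}\sum_{l=0}^{k_n-1} |\hat{\mathcal Y}_{k_n}(t_l)|^{2/3}\Bigr)^{3p/2(p+1)}.$$

Observe (12) and use Lemma 3 to get

$$\begin{aligned}
\limsup_{n\to\infty}\, c(\hat X^{**}_{p,n}(1))/n
&\le \limsup_{n\to\infty} E\Bigl(\frac{1}{k_n}\sum_{l=0}^{k_n-1} |\hat{\mathcal Y}_{k_n}(t_l)|^{2/3}\Bigr)^{3p/2(p+1)} \\
&\le \limsup_{n\to\infty} E\Bigl(\frac{1}{k_n}\sum_{l=0}^{k_n-1} |\mathcal Y(t_l)|^{2/3}\Bigr)^{3p/2(p+1)} \\
&= E\Bigl(\int_0^1 |\mathcal Y(t)|^{2/3}\,dt\Bigr)^{3p/2(p+1)},
\end{aligned}$$

which proves the first inequality.

Next, observe that, for the scheme $\hat X^{**}_{p,n}$,

$$\sum_{l=0}^{k_n-1} \bigl(\hat{\mathcal Y}_{k_n}(t_l)/(\mu_l^{(n)} + 1)\bigr)^2 \le k_n^3/n^2\cdot\Bigl(\frac{1}{k_n}\sum_{l=0}^{k_n-1} |\hat{\mathcal Y}_{k_n}(t_l)|^{2/3}\Bigr)^{3/(p+1)}.$$

Hence,

$$1/k_n^{3p/2}\cdot B_{k_n} \le 1/n^p\cdot\Bigl(\frac{1}{k_n}\sum_{l=0}^{k_n-1} |\hat{\mathcal Y}_{k_n}(t_l)|^{2/3}\Bigr)^{3p/2(p+1)}.$$

Using Lemmas 6 and 3, we thus conclude that

$$\begin{aligned}
\limsup_{n\to\infty}\, n\cdot e_p(\hat X^{**}_{p,n}(1))
&\le m_p/12^{1/2}\cdot\limsup_{n\to\infty}\Bigl(E\Bigl(\frac{1}{k_n}\sum_{l=0}^{k_n-1} |\hat{\mathcal Y}_{k_n}(t_l)|^{2/3}\Bigr)^{3p/2(p+1)}\Bigr)^{1/p} \\
&= m_p/12^{1/2}\cdot\Bigl(E\Bigl(\int_0^1 |\mathcal Y(t)|^{2/3}\,dt\Bigr)^{3p/2(p+1)}\Bigr)^{1/p},
\end{aligned}$$

which completes the proof. □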

Clearly, Lemma 7 implies the upper bound in Theorem 2(i). 

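Lemma 8. The schemes $\hat X^{*}_n$ and $\hat X_n$ satisfy

$$\limsup_{n\to\infty}\, n\cdot e_p(\hat X^{*}_n(1)) \le C^{*}_p/\sqrt{12}$$

and

$$\limsup_{n\to\infty}\, n\cdot e_2(\hat X_n(1)) \le C_2/\sqrt{12}.$$

Proof. By definition of $\hat X^{*}_n(1)$,

$$\sum_{l=0}^{k_n-1} \bigl(\hat{\mathcal Y}_{k_n}(t_l)/(\mu_l^{(n)} + 1)\bigr)^2 \le 1/(n - k_n)^2\cdot\Bigl(\sum_{l=0}^{k_n-1} |\hat{\mathcal Y}_{k_n}(t_l)|^{2/3}\Bigr)^3.$$

Thus,

$$1/k_n^{3p/2}\cdot B_{k_n} \le 1/(n - k_n)^p\cdot\Bigl(\frac{1}{k_n}\sum_{l=0}^{k_n-1} |\hat{\mathcal Y}_{k_n}(t_l)|^{2/3}\Bigr)^{3p/2}.$$

Use Lemmas 6 and 3 to obtain

$$\begin{aligned}
\limsup_{n\to\infty}\, n\cdot e_p(\hat X^{*}_n(1))
&\le m_p/12^{1/2}\cdot\limsup_{n\to\infty}\Bigl(E\Bigl(\frac{1}{k_n}\sum_{l=0}^{k_n-1} |\hat{\mathcal Y}_{k_n}(t_l)|^{2/3}\Bigr)^{3p/2}\Bigr)^{1/p} \\
&\le m_p/12^{1/2}\cdot\Bigl(E\Bigl(\int_0^1 |\mathcal Y(t)|^{2/3}\,dt\Bigr)^{3p/2}\Bigr)^{1/p},
\end{aligned}$$

which establishes the first inequality.

By definition of $\hat X_n$, the numbers $\mu_l^{(n)}$ are deterministic with

$$\sum_{l=0}^{k_n-1} E|\mathcal Y(t_l)|^2/(\mu_l^{(n)} + 1)^2 \le 1/(n - k_n)^2\cdot\Bigl(\sum_{l=0}^{k_n-1} (E|\mathcal Y(t_l)|^2)^{1/3}\Bigr)^3.$$

Furthermore, Lemma 2 implies

$$\Bigl(\sum_{l=0}^{k_n-1} \frac{E|\hat{\mathcal Y}_{k_n}(t_l)|^2}{(\mu_l^{(n)} + 1)^2}\Bigr)^{1/2} \le \Bigl(\sum_{l=0}^{k_n-1} \frac{E|\mathcal Y(t_l)|^2}{(\mu_l^{(n)} + 1)^2}\Bigr)^{1/2} + c.$$

Hence, by Lemma 6,

$$n\cdot e_2(\hat X_n(1)) \le m_2/12^{1/2}\cdot n/(n - k_n)\cdot\Bigl(\frac{1}{k_n}\sum_{l=0}^{k_n-1} (E|\mathcal Y(t_l)|^2)^{1/3}\Bigr)^{3/2} + c\cdot n/k_n^{3/2}.$$

Observing (12), we get

$$\begin{aligned}
\limsup_{n\to\infty}\, n\cdot e_2(\hat X_n(1))
&\le m_2/12^{1/2}\cdot\limsup_{n\to\infty}\Bigl(\frac{1}{k_n}\sum_{l=0}^{k_n-1} (E|\mathcal Y(t_l)|^2)^{1/3}\Bigr)^{3/2} \\
&= m_2/12^{1/2}\cdot\Bigl(\int_0^1 (E|\mathcal Y(t)|^2)^{1/3}\,dt\Bigr)^{3/2},
\end{aligned}$$

which completes the proof. □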

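Lemma 8 yields the upper bounds in Theorem 2(ii), (iii). It remains to
establish the upper bound in Theorem 2(iv).

Lemma 9. The scheme $\hat X^{\mathrm{equi}}_n$ satisfies

$$\limsup_{n\to\infty}\, n\cdot e_p(\hat X^{\mathrm{equi}}_n(1)) \le C^{\mathrm{equi}}_p/\sqrt{12}.$$

Proof. Lemma 6 yields

$$e_p(\hat X^{\mathrm{equi}}_n(1)) \le m_p/12^{1/2}\cdot 1/n^{3/2}\cdot\Bigl(E\Bigl(\sum_{l=0}^{n-1} |\hat{\mathcal Y}_n(t_l)|^2\Bigr)^{p/2}\Bigr)^{1/p} + c/n^{3/2}.$$

Hence, by Lemma 2,

$$n\cdot e_p(\hat X^{\mathrm{equi}}_n(1)) \le m_p/12^{1/2}\cdot\Bigl(E\Bigl(\frac1n\sum_{l=0}^{n-1} |\mathcal Y(l/n)|^2\Bigr)^{p/2}\Bigr)^{1/p} + c/n^{1/2}.$$

We conclude

$$\begin{aligned}
\limsup_{n\to\infty}\, n\cdot e_p(\hat X^{\mathrm{equi}}_n(1))
&\le m_p/12^{1/2}\cdot\limsup_{n\to\infty}\Bigl(E\Bigl(\frac1n\sum_{l=0}^{n-1} |\mathcal Y(l/n)|^2\Bigr)^{p/2}\Bigr)^{1/p} \\
&= m_p/12^{1/2}\cdot\Bigl(E\Bigl(\int_0^1 |\mathcal Y(t)|^2\,dt\Bigr)^{p/2}\Bigr)^{1/p},
\end{aligned}$$

which completes the proof. □

The upper bounds from Theorem 2 imply the upper bounds from Theorem 1.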
APPENDIX 

The goal of this appendix is to establish the error bound (15) for the
auxiliary scheme $\hat X^{\mathrm{aux}}_k$ from Section 5. Throughout, we fix a discretization

$$0 = t_0 < \dots < t_k = 1,$$

and we put

$$\Delta_l = t_{l+1} - t_l, \qquad \Delta_{\max} = \max_{l=0,\dots,k-1} \Delta_l.$$

Moreover, we use $\mathfrak F_t$ to denote the $\sigma$-algebra that is generated by $X(0)$ and $W(s)$,
$0 \le s \le t$. Finally, we put

$$|||Y|||_q = (E|Y|^q)^{1/q}$$

for a random variable $Y$ and $q \ge 1$.



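We start with error bounds on continuous versions of the Wagner-Platen
scheme and its truncated version. Define processes $X^{\mathrm{WPt}}$ and $X^{\mathrm{WP}}$ by
$X^{\mathrm{WPt}}(0) = X^{\mathrm{WP}}(0) = X(0)$,

$$\begin{aligned}
X^{\mathrm{WPt}}(t) ={}& X^{\mathrm{WPt}}(t_l) + a(t_l,X^{\mathrm{WPt}}(t_l))\cdot(t - t_l) + \sigma(t_l,X^{\mathrm{WPt}}(t_l))\cdot(W(t) - W(t_l)) \\
&+ 1/2\cdot(\sigma\sigma^{(0,1)})(t_l,X^{\mathrm{WPt}}(t_l))\cdot((W(t) - W(t_l))^2 - (t - t_l)) \\
&+ (\sigma^{(1,0)} + a\sigma^{(0,1)} - 1/2\cdot\sigma(\sigma^{(0,1)})^2)(t_l,X^{\mathrm{WPt}}(t_l))\cdot(W(t) - W(t_l))\cdot(t - t_l) \\
&+ 1/6\cdot(\sigma(\sigma^{(0,1)})^2 + \sigma^2\sigma^{(0,2)})(t_l,X^{\mathrm{WPt}}(t_l))\cdot(W(t) - W(t_l))^3 \\
&+ 1/2\cdot(a^{(1,0)} + aa^{(0,1)} + 1/2\cdot\sigma^2a^{(0,2)})(t_l,X^{\mathrm{WPt}}(t_l))\cdot(t - t_l)^2
\end{aligned}$$

and

$$\begin{aligned}
X^{\mathrm{WP}}(t) ={}& X^{\mathrm{WP}}(t_l) + a(t_l,X^{\mathrm{WP}}(t_l))\cdot(t - t_l) + \sigma(t_l,X^{\mathrm{WP}}(t_l))\cdot(W(t) - W(t_l)) \\
&+ 1/2\cdot(\sigma\sigma^{(0,1)})(t_l,X^{\mathrm{WP}}(t_l))\cdot((W(t) - W(t_l))^2 - (t - t_l)) \\
&+ (\sigma^{(1,0)} + a\sigma^{(0,1)} - 1/2\cdot\sigma(\sigma^{(0,1)})^2)(t_l,X^{\mathrm{WP}}(t_l))\cdot(W(t) - W(t_l))\cdot(t - t_l) \\
&+ 1/6\cdot(\sigma(\sigma^{(0,1)})^2 + \sigma^2\sigma^{(0,2)})(t_l,X^{\mathrm{WP}}(t_l))\cdot(W(t) - W(t_l))^3 \\
&+ 1/2\cdot(a^{(1,0)} + aa^{(0,1)} + 1/2\cdot\sigma^2a^{(0,2)})(t_l,X^{\mathrm{WP}}(t_l))\cdot(t - t_l)^2 \\
&+ \mathcal G(t_l,X^{\mathrm{WP}}(t_l))\cdot\int_{t_l}^{t} (W(s) - W(t_l))\,ds
\end{aligned}$$

for $t \in [t_l,t_{l+1}]$, $l = 0,\dots,k-1$.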

Lemma 10. The processes $X^{\mathrm{WPt}}$ and $X^{\mathrm{WP}}$ satisfy:

(i) $\sup_{t\in[0,1]} E|X^{\mathrm{WPt}}(t)|^{16p} \le c$,

(ii) $\sup_{t\in[0,1]} E|X^{\mathrm{WP}}(t)|^{16p} \le c$,

as well as

(iii) $\sup_{t\in[0,1]} E|X(t) - X^{\mathrm{WPt}}(t)|^{4p} \le c\cdot\Delta_{\max}^{4p}$,

(iv) $\sup_{t\in[0,1]} E|X(t) - X^{\mathrm{WP}}(t)|^{4p} \le c\cdot\Delta_{\max}^{6p}$.


See Kloeden and Platen (1995) for a proof of (ii) and (iv) under much
stronger assumptions on $a$ and $\sigma$ than stated in (A) in Section 2. For a proof
of Lemma 10 under condition (A), we refer to Müller-Gronbach (2002b).

Next, define the process $Q$ by $Q(0) = 0$ and

$$\begin{aligned}
Q(t) ={}& \bigl(1 + a^{(0,1)}(t_l,X^{\mathrm{WPt}}(t_l))\cdot(t - t_l) \\
&\quad + \sigma^{(0,1)}(t_l,X^{\mathrm{WPt}}(t_l))\cdot(W(t) - W(t_l))\bigr)\cdot Q(t_l) \\
&+ \mathcal G(t_l,X^{\mathrm{WPt}}(t_l))\cdot\int_{t_l}^{t} (W(s) - W(t_l))\,ds
\end{aligned}$$

for $t \in [t_l,t_{l+1}]$. Note that $Q(t_l) = Q_k(t_l)$ for an equidistant discretization (9).


Lemma 11. The process $Q$ satisfies

$$\sup_{t\in[0,1]} E|Q(t)|^{4p} \le c\cdot\Delta_{\max}^{4p}$$

and

$$\sup_{t\in[t_l,t_{l+1}]} E|Q(t) - Q(t_l)|^{4p} \le c\cdot\Delta_{\max}^{6p}.$$


Proof. Fix $t \in [t_l,t_{l+1}]$ and let

$$U = (1 + a^{(0,1)}(t_l,X^{\mathrm{WPt}}(t_l))\cdot(t - t_l))\cdot Q(t_l), \qquad V = Q(t) - U.$$

Put $q = \lceil 2p\rceil$ and note that $4p \le 2q \le 8p$. Let $r \in \{1,\dots,2q\}$. Observing (A),
we have

(25) $$E(|V|^r \,|\, \mathfrak F_{t_l}) \le c\cdot|Q(t_l)|^r\cdot(t - t_l)^{r/2} + c\cdot(1 + |X^{\mathrm{WPt}}(t_l)|^r)\cdot(t - t_l)^{3r/2}$$

as well as

$$|U|^r \le (1 + c\cdot(t - t_l))\cdot|Q(t_l)|^r.$$

Moreover, if $r$ is odd, then

$$E(V^r \,|\, \mathfrak F_{t_l}) = 0.$$

E((Q{t)) 2q \Et l ) = U 2q + f; ( 2q ) • U 2q ~ r ■ E(V r \T tl ) 

r =l ' ' 

<(l +c .((_(,)).|Q( t ,)p! 

+c • t (li) ■ iQ(t,)i 2 »- 2 ’' • (i+iv wpi (ti)i 2r ). (t - t ,r. 

r=l ' ' 
Use Lemma 10(i) to obtain

iiowiig < a+c• (i - (,»• +c•£ (It) ■ iwiir* <( - ti)3r 

r= 1 x ' 

< (1 + C •(<-«,)) • IIIO(‘i)llg + C • (t -1,) • (IIIQ(ti)lll2, + (i - fi)) 2 ' 

< (1 + c ■ (t - «,)) ■ ||<3(ti)||g + c ■ (t - t,) 2 « +1 , 

so that the first inequality follows from Gronwall's lemma.

Due to (25) and Lemma 10(i),

$$E|V|^{2q} \le c\cdot E|Q(t_l)|^{2q}\cdot(t - t_l)^{q} + c\cdot(t - t_l)^{3q}.$$

Thus, by (A) and the first inequality,

$$E|Q(t) - Q(t_l)|^{2q} \le c\cdot(t - t_l)^{2q}\cdot E|Q(t_l)|^{2q} + c\cdot E|V|^{2q} \le c\cdot\Delta_{\max}^{3q},$$

which proves the second inequality. □

Finally, we consider the process

$$X^{\mathrm{aux}} = X^{\mathrm{WPt}} + Q.$$


Lemma 12. The process $X^{\mathrm{aux}}$ satisfies

$$\sup_{t\in[0,1]} E|X(t) - X^{\mathrm{aux}}(t)|^p \le c\cdot\Delta_{\max}^{3p/2}.$$

Note that $X^{\mathrm{aux}}(t_l) = \hat X^{\mathrm{aux}}_k(t_l)$ for an equidistant discretization (9).
Consequently, Lemma 12 immediately implies (15).

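Proof of Lemma 12. In view of Lemma 10(iv), it is enough to show

(26) $$\sup_{t\in[0,1]} E|X^{\mathrm{WP}}(t) - X^{\mathrm{aux}}(t)|^p \le c\cdot\Delta_{\max}^{3p/2}.$$

Let

$$\begin{aligned}
g_1 &= 1/2\,\sigma\sigma^{(0,1)}, \\
g_2 &= \sigma^{(1,0)} + a\sigma^{(0,1)} - 1/2\,\sigma(\sigma^{(0,1)})^2, \\
g_3 &= 1/6\,(\sigma(\sigma^{(0,1)})^2 + \sigma^2\sigma^{(0,2)}), \\
g_4 &= 1/2\,(a^{(1,0)} + aa^{(0,1)} + 1/2\,\sigma^2a^{(0,2)}), \\
g_5 &= \mathcal G.
\end{aligned}$$

Fix $t \in [t_l,t_{l+1}]$ and put

$$\begin{aligned}
A &= a(t_l,X^{\mathrm{WP}}(t_l)) - a(t_l,X^{\mathrm{WPt}}(t_l)) - a^{(0,1)}(t_l,X^{\mathrm{WPt}}(t_l))\cdot(X^{\mathrm{WP}}(t_l) - X^{\mathrm{WPt}}(t_l)), \\
B &= \sigma(t_l,X^{\mathrm{WP}}(t_l)) - \sigma(t_l,X^{\mathrm{WPt}}(t_l)) - \sigma^{(0,1)}(t_l,X^{\mathrm{WPt}}(t_l))\cdot(X^{\mathrm{WP}}(t_l) - X^{\mathrm{WPt}}(t_l))
\end{aligned}$$

as well as

$$U = (X^{\mathrm{WP}}(t_l) - X^{\mathrm{aux}}(t_l))\cdot(1 + a^{(0,1)}(t_l,X^{\mathrm{WPt}}(t_l))\cdot(t - t_l))$$

and

$$\begin{aligned}
V ={}& A\cdot(t - t_l) \\
&+ \bigl(\sigma^{(0,1)}(t_l,X^{\mathrm{WPt}}(t_l))\cdot(X^{\mathrm{WP}}(t_l) - X^{\mathrm{aux}}(t_l)) + B\bigr)\cdot(W(t) - W(t_l)) \\
&+ (g_1(t_l,X^{\mathrm{WP}}(t_l)) - g_1(t_l,X^{\mathrm{WPt}}(t_l)))\cdot((W(t) - W(t_l))^2 - (t - t_l)) \\
&+ (g_2(t_l,X^{\mathrm{WP}}(t_l)) - g_2(t_l,X^{\mathrm{WPt}}(t_l)))\cdot(W(t) - W(t_l))\cdot(t - t_l) \\
&+ (g_3(t_l,X^{\mathrm{WP}}(t_l)) - g_3(t_l,X^{\mathrm{WPt}}(t_l)))\cdot(W(t) - W(t_l))^3 \\
&+ (g_4(t_l,X^{\mathrm{WP}}(t_l)) - g_4(t_l,X^{\mathrm{WPt}}(t_l)))\cdot(t - t_l)^2 \\
&+ (g_5(t_l,X^{\mathrm{WP}}(t_l)) - g_5(t_l,X^{\mathrm{WPt}}(t_l)))\cdot\int_{t_l}^{t} (W(s) - W(t_l))\,ds.
\end{aligned}$$

By definition,

$$X^{\mathrm{WP}}(t) - X^{\mathrm{aux}}(t) = U + V.$$

Due to (A), we have

$$|g_n(t_l,X^{\mathrm{WP}}(t_l)) - g_n(t_l,X^{\mathrm{WPt}}(t_l))| \le c\cdot(1 + |X^{\mathrm{WPt}}(t_l)|^2)\cdot|X^{\mathrm{WP}}(t_l) - X^{\mathrm{WPt}}(t_l)|$$

for $n = 1,\dots,5$, and

$$|A| \le c\cdot|X^{\mathrm{WP}}(t_l) - X^{\mathrm{WPt}}(t_l)|^2, \qquad |B| \le c\cdot|X^{\mathrm{WP}}(t_l) - X^{\mathrm{WPt}}(t_l)|^2.$$

Put $q = \lceil p/2\rceil$ and note that $p \le 2q \le 2p$. For $r = 1,\dots,2q$, we obtain

$$|U|^r \le (1 + c\cdot(t - t_l))\cdot|X^{\mathrm{WP}}(t_l) - X^{\mathrm{aux}}(t_l)|^r$$

as well as

$$\begin{aligned}
E(|V|^r \,|\, \mathfrak F_{t_l}) \le{}& c\cdot|X^{\mathrm{WP}}(t_l) - X^{\mathrm{aux}}(t_l)|^r\cdot(t - t_l)^{r/2} \\
&+ c\cdot|X^{\mathrm{WP}}(t_l) - X^{\mathrm{WPt}}(t_l)|^{2r}\cdot(t - t_l)^{r/2} \\
&+ c\cdot(1 + |X^{\mathrm{WPt}}(t_l)|^{2r})\cdot|X^{\mathrm{WP}}(t_l) - X^{\mathrm{WPt}}(t_l)|^r\cdot(t - t_l)^r.
\end{aligned}$$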
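Moreover,

$$\begin{aligned}
|E(V \,|\, \mathfrak F_{t_l})| &= |A\cdot(t - t_l) + (g_4(t_l,X^{\mathrm{WP}}(t_l)) - g_4(t_l,X^{\mathrm{WPt}}(t_l)))\cdot(t - t_l)^2| \\
&\le c\cdot|X^{\mathrm{WP}}(t_l) - X^{\mathrm{WPt}}(t_l)|^2\cdot(t - t_l) \\
&\quad + c\cdot(1 + |X^{\mathrm{WP}}(t_l)|^2)\cdot|X^{\mathrm{WP}}(t_l) - X^{\mathrm{WPt}}(t_l)|\cdot(t - t_l)^2.
\end{aligned}$$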

Hence, 

E(\X wp (t,)-X m (t,)\ 2 «\n,) 

=Ef 2 r 5 ) 

r =0 V ' 

< (1 + c • (t - (,)) • |V wp (t,) - -Y““(t,)| 2 ’ 

+ c ■ jr ( 2 «) ■ |X wp (t,) - X““(t,)| 2 » ■ (t - t I ) r/2 

+ c - 2q • It/I 2 ’- 1 • (1 + |X wp (t,)| 2 ) • |X WP (*,) - X wp ‘(t,)| • (t - ti) 2 
+ p 'E( 2 r ‘ ! )-TI 2 ’-’'.(i + |x wp ( tl )| 2 ’') 

r=2 ' ' 

x|x wp (^)-x wpt (tor-(t-^r 

+ c - £ ( 2 r 9 ) • \ u \ 2q ~ r ■ l^ WP fa) - ^ WPt fa)l 2r • (t - tty' 2 

r =1 ^ ' 

< (1 + c ■((-(,)) - |X wp (f,) - X““((,)| 2 ’ 

+c-Y,( 2 y-\u^--(i + \x^( tl r ) 

r= 1 ' ' 

x|x wp (^)-x wpt (tor-(i -^) 1+r/2 

+ c- £ Cr) • l C/ | 2< '“ r • l xWP (^) - ^ WPt fa)T • (t - t l ) ma ^ 1 ’ r / 2 \ 
r =1 ^ ? ' 

By Lemma 10, we get 

-B(it/i 2s - r • (i +1 v wp (t,)i 2r ) • i/t wp (t,) - v wpt (t,)n 

< moilr •(£((! + iv wp (*,)i 4 ’) • ix wp (*,) - x wpi (« I )i 2,, ))’' /<2 ' ,) 

< llcilT • (1 + |||x wp (t I )||| 2 ;) ■ |||x wp (t i ) - V wp *(t,)|||3 a 
<c-iiix WP (i i )-x*“(t,)ii;- r .Au 
and similarly

E(\U\ 2q ~ r • | X WF {ti) - X WPt (t;)| 2r ) 

Thus, 

ll|X wp (t) - X»“(t)g 

<(i+c.(t-t 1 ))-ix wp (t 1 )-x‘“(t i )e 

+c-(t-t,)-Y,( 2 !!) 1 x wp (t,j - x“(t,)iig- r ■ 

r—l ' ' 

<(i+ c .(t-t i ))-ix wp (t 1 )-x‘“(t i )e 

+ c ■ (t. — t,) ■ (||X WP ((,) - X*“(t,)||| 25 + AUl?" 

< (1 + c-(t- t|)) ■ |X wp (i,) - x*“(t,)|||| + c ((-*,)• A 3 „V- 

Now apply Gronwall’s lemma to complete the proof. □ 